Tools that have been useful to me in the past, or might be useful in the future, loosely grouped. Not things I use daily.
🦋 Web scrapers
Curl impersonate
A special build of curl that can impersonate the four major browsers: Chrome, Edge, Safari & Firefox. curl-impersonate is able to perform TLS and HTTP handshakes that are identical to those of a real browser.
Finally got around to trying this (I wanted a copy of a site that’s behind Cloudflare).
It works pretty well, and it can be driven from Scrapy with scrapy-impersonate, but these days, more often than not you want the page after JS has messed with it. So in the end I used scrapy-playwright to do the job. Minimal example:
You also need a couple of tweaks to settings.py:
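The settings below are the two that the scrapy-playwright README asks for; they may not be exactly the tweaks used here, so treat this as a sketch of a typical setup rather than the original file.

```python
# settings.py (fragment)

# Route HTTP and HTTPS downloads through the Playwright-backed handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright needs Twisted's asyncio-based reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
```

Only requests that carry `"playwright": True` in their `meta` are rendered in a browser; everything else still goes through Scrapy's normal downloader.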
Puppeteer heap snapshot
puppeteer-heap-snapshot is a Node.js module that, given a Puppeteer browser page, can capture and parse a heap snapshot and deserialize objects that contain a set of properties. It comes with a nifty CLI tool too so we can quickly prototype scrapers from our terminal.
Instead of trying to use CSS selectors to get at the data we want, we grab the data straight out of the browser’s working memory.
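The idea of picking objects by the properties they carry, rather than by where they sit in the page, can be sketched in a few lines of Python. This is only an illustration on plain nested data: it does not read a real heap snapshot, and the `find_objects` helper and the sample `state` are made up for the example.

```python
def find_objects(node, required):
    """Yield every dict, at any depth, that has all the keys in `required`."""
    if isinstance(node, dict):
        if set(required).issubset(node):
            yield node
        for value in node.values():
            yield from find_objects(value, required)
    elif isinstance(node, list):
        for item in node:
            yield from find_objects(item, required)


# Hypothetical stand-in for whatever a page keeps in memory.
state = {
    "route": "/video/123",
    "cache": [
        {"kind": "ad", "id": "a1"},
        {"kind": "video", "id": "v123", "title": "Demo clip", "viewCount": 42},
    ],
    "player": {
        "current": {"id": "v123", "title": "Demo clip", "viewCount": 42, "muted": False},
    },
}

# No selectors: just name the properties the interesting object must have.
matches = list(find_objects(state, {"id", "title", "viewCount"}))
for match in matches:
    print(match["id"], match["title"], match["viewCount"])
```

The ad entry is skipped because it lacks `title` and `viewCount`; both copies of the video object are found, wherever they happen to live in the structure.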
grab-site is an easy preconfigured web crawler designed for backing up websites. Give grab-site a URL and it will recursively crawl the site and write WARC files. Internally, grab-site uses a fork of wpull for crawling.
Note
Still looking for something that can maintain an archive of a Facebook group.
🐌 Convert raster to vector (unsorted)
https://www.vectorization.org/
https://www.visioncortex.org/vtracer/
https://online.rapidresizer.com/tracer.php
https://fconvert.com/autotrace/
https://online-converting.com/autotrace/
https://github.com/fromtheexchange/image2svg-awesome
📈 Convert text to diagram
JavaScript based diagramming and charting tool that renders Markdown-inspired text definitions to create and modify diagrams dynamically.
Not the prettiest output, but the dot language makes generating large diagrams really simple. I use it whenever I need to turn structure (e.g. library dependencies) into a graphical overview.
Consumes dot files and lets you query them. What’s the shortest path between A and B? What relies on C?
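To give a feel for both halves, here is a small Python sketch: it writes a dependency map out as dot text, then answers the same two questions directly on the map. The `DEPS` data and the function names are invented for the example, and the queries are a plain breadth-first search, not the query tool's own syntax.

```python
from collections import deque

# Hypothetical dependency map: package -> packages it depends on.
DEPS = {
    "app": ["web", "db"],
    "web": ["http", "templates"],
    "db": ["driver"],
    "http": ["sockets"],
    "templates": [],
    "driver": ["sockets"],
    "sockets": [],
}


def to_dot(deps, name="deps"):
    """Render the mapping as a dot digraph, one edge per dependency."""
    lines = [f"digraph {name} {{"]
    for node in sorted(deps):
        for target in sorted(deps[node]):
            lines.append(f'    "{node}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


def shortest_path(deps, start, goal):
    """Breadth-first search; returns the node list from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in deps.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def relies_on(deps, target):
    """Every node that reaches `target`, directly or transitively."""
    return sorted(
        node for node in deps
        if node != target and shortest_path(deps, node, target)
    )


print(to_dot(DEPS))
print(shortest_path(DEPS, "app", "sockets"))
print(relies_on(DEPS, "sockets"))
```

The printed dot text can be saved to a file and rendered with `dot -Tsvg deps.dot -o deps.svg`.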
A more complete list.
🔪 Edit PDFs
This is absolutely fantastic. A whole bunch of PDF tools, slapped in a Docker container with a simple and effective web interface on top. Less than five minutes to get running. I used it to generate my John Pory PDF.
🦀 Shell scripting
A collection of useful utilities for enhancing shell scripts.
Dotfile (Unix config file) manager. Maybe it’s better than a GitHub repo, maybe it’s just another yak waiting to be shaved.
Set environment variables based on path.
🦚 Graphics libraries
Rough.js is a small (<9kB gzipped) graphics library that lets you draw in a sketchy, hand-drawn-like, style. The library defines primitives to draw lines, curves, arcs, polygons, circles, and ellipses. It also supports drawing SVG paths.
Programming
Code editor. Clean. Fast. Configurable key bindings. I don’t use one tenth of the features of an IDE, so this might be perfect for me.
Absolutely fantastic side-by-side command-line diff. Syntax-based, rather than character-based. My default choice.
🌀 Other
Battery charge limiter for Macs.
Makes your MBP sound like a clicky-clacky buckling spring keyboard. I know it sounds like a joke. It probably is a joke. But for some reason it really helps me focus.
Enhance the capabilities of Amazon’s Fire tablets (requires Windows).
Transcribe audio, $4 per hour. Note: if you’re in the UK it’ll charge you £4 per hour (currently a 28% surcharge). I have a few hundred hours of audio to transcribe and that leaves a bad taste, so I’m going to try using the tools directly rather than paying the convenience tax.