@@ -36,7 +36,12 @@
  3. An anecdote from your own work that provides rich texture and context for what you do.
  4. Some open questions that invite people to discuss.
 - [Keep making your post more opinionated until it reflects your true beliefs](https://twitter.com/HamelHusain/status/1751995737095709164). Don't hedge. People want to hear what you think!
-- The title gets read way more than the rest, so make it count.
+- The title gets read way more than the rest, [so make it count](https://dynomight.net/titles/).
+ - Think of the title as a "classifier": it should attract those who will like the content and deter those who won't.
+ - Use specific language that signals to your target audience that the content is for them.
+ - The title can also convey your writing style and tone.
+ - Be cautious with overly clever or punny titles if you don't have an established audience.
+ - Consider title-driven creation: first choose a compelling title, then write content that delivers on its promise.
 - The inverted pyramid works well for blog posts. Put the tweet-length version of the post in the title or first paragraph. Get to the point quickly, then elaborate. Readers can bail out at any point of the text and still take home most of what mattered, while the meticulous crowd can have their nitpicks addressed toward the end.
 - [You're not just writing for today's invisible audience](https://web.archive.org/web/20250219111210/https://andysblog.uk/why-blog-if-nobody-reads-it/). You're writing for:
  - Future you. Your posts become a time capsule of your evolving mind.
Open Data.md (+3 -1)
@@ -22,6 +22,8 @@
 
 Iterative improvements over public datasets yield large amounts of value ([check how Dune did it with blockchain data](https://dune.com/blog/the-community-data-platform))¹. Access to data gives people the opportunity to create new business and make better decisions. Data is vital to understanding the world and improving public welfare. Metcalfe's Law applies to data too. The more connected a dataset is to other data elements, the more valuable it is.
 
+In the blockchain example, data is Open, Verifiable, and Useful. And yet, the main provider of data is Dune, a company that captured most of the data layer. Users can run `cryo` but there are no incentives for them to share the data. There isn't a matchmaking market for data and people are forced to repeat the same work.
+
 Open Source code has made a huge impact in the world. Let's make Open Data do the same! Open data is, essentially, public infrastructure (similar to roads, bridges, or the internet). Let's make it possible for [anyone to fork and re-publish fixed, cleaned, reformatted datasets as easily as we do the same things with code](https://juan.benet.ai/blog/2014-02-21-data-management-problems/).
 
 This document is a collection of ideas and principles to make Open Data more accessible, maintainable, and useful. Also, recognizing that a lot of people are already working on this, there are some amazing datasets, tools, and organizations out there, and, that Open Data is a people problem at 80%. This document is biased towards the technical side of things, as I think that's where I can contribute the most. I believe we can do much more with the available data.
@@ -35,7 +37,7 @@
 During the last few years, a large number of new data and open source tools have emerged. There are new query engines (e.g: DuckDB, DataFusion, ...), execution frameworks (WASM), data standards (Arrow, Parquet, ...), and a growing set of open data marketplaces (Datahub, HuggingFace Datasets, Kaggle Datasets). "Small data" deserves more tooling and people working on it. There are many novel technologies too.
 
 - [Differential Privacy](https://en.wikipedia.org/wiki/Differential_privacy) that allows releasing statistical information about datasets while protecting the privacy of individual data subjects.
-- Homomorphic encryption.
+- Fully Homomorphic Encryption.
 - New [deidentification techniques](https://www.ipc.on.ca/sites/default/files/legacy/2016/08/Deidentification-Guidelines-for-Structured-Data.pdf).
 - Data watermarking, fingerprinting, and provenance tracking with blockchains.
 - Better CPUs, compression algorithms, and storage technologies.