📝 Fix typos across handbook notes · davidgasquez.com/handbook@902361a

+1 -1

Artificial Intelligence Models.md

···
17 17 - LLMs amplify existing expertise rather than replacing it.
18 18 - Be aware of training cut-off dates when using LLMs.
19 19 - "AIs" can be dangerous in underspecified environments (e.g: pausing games to last longer in the level) but those are the places where we will use them most. If something is well specified, there might be better solutions/optimizations (maths, code, ...).
20 - - [When the main purpose of writing is to demonstrate your thinking (building trust, applying for a job), don't use LLM output](https://x.com/HamelHusain/status/1976720326106173673). Use LLMs hen need to communicate info, or do admin stuff, where the person really just wants info and doesn't need to be convinced "how you think". LLMs are good at writing but bad at thinking.
20 + - [When the main purpose of writing is to demonstrate your thinking (building trust, applying for a job), don't use LLM output](https://x.com/HamelHusain/status/1976720326106173673). Use LLMs when you need to communicate info, or do admin stuff, where the person really just wants info and doesn't need to be convinced "how you think". LLMs are good at writing but bad at thinking.
21 21 - LLMs as "stateless functions". Fixed weights, no updating. LLMs are in-context learners.
22 22
23 23 ## Prompting

+3 -3

Data Culture.md

···
8 8 - [Data's impact is tough to measure — it doesn't always translate to value](https://dfrieds.com/articles/data-science-reality-vs-expectations.html)
9 9 - The value of "insights" is often unknown.
10 10 - The Data Team should be building and iterating the [Data Product](https://locallyoptimistic.com/post/run-your-data-team-like-a-product-team/).
11 - - Notebooks are a workshop. Production systems are the factory. Not everything needs to be put into production. Not everything should be a notebook. You need both. Lean in to the strength of each.
11 + - Notebooks are a workshop. Production systems are the factory. Not everything needs to be put into production. Not everything should be a notebook. You need both. Lean into the strength of each.
12 12 - Data is fundamentally a collaborative design process rather than a tool, an analysis, or even a product. [Data works best when the entire feedback loop from idea to production is an iterative process](https://pedram.substack.com/p/data-can-learn-from-design).
13 13 - [To get buy in, explain how the business could benefit from better data](https://youtu.be/Mlz1VwxZuDs) (e.g: more and better insights). Start small and show value.
14 14 - Run _[Purpose Meetings](https://www.avo.app/blog/tracking-the-right-product-metrics)_ or [Business Metrics Review](https://youtu.be/nlMn572Dabc).
···
69 69 - [So much of data work is about accumulating little bits of knowledge and building a shared context in your org so that it's possible to have the big, earth shattering revelations we all wish we could drive on a predictable schedule](https://twitter.com/imightbemary/status/1536368160961572864).
70 70 - A big purpose of data is knowledge. Knowledge is **"theories or models that allow you to predict the outcomes of your business actions"**. Insights may originate from data but are confirmed through actions.
71 71 - You won't have the best allocation of resources in a reactive team. Data teams need extra [[slack]]. [Balance user requests with actual needs](https://scientistemily.substack.com/p/product-management-skills-for-data).
72 - - Do weekly recaps in Slack in to highlight key items, company-wide progress toward north-stars, improvements in certain areas, new customer highlights. All positive and fun stuff.
72 + - Do weekly recaps in Slack to highlight key items, company-wide progress toward north-stars, improvements in certain areas, new customer highlights. All positive and fun stuff.
73 73 - How can we measure the data team impact?
74 74 - Making a [[Writing a Roadmap|roadmap]] can help you telling if you are hitting milestone deadlines or letting them slip.
75 75 - Embedded data team members need to help other teams build their roadmap too.
···
99 99 - [Aim for a culture of celebrating measurable progress and learnings, versus celebrating shipping](https://erikbern.com/2021/07/07/the-data-team-a-short-story.html).
100 100 - Align company on key actions. Every stakeholder should know how to explore that data.
101 101 - Do pre-mortems. Where would we see the impact of _X_ going wrong? Model that and plot it on a dashboard.
102 - - You can force coordination by making a chart and start the discussion with it. Having a default chart will foce people to fight on the definition and also provides a starting point. Discussions are much better when there are based on data and definitions.
102 + - You can force coordination by making a chart and start the discussion with it. Having a default chart will force people to fight on the definition and also provides a starting point. Discussions are much better when there are based on data and definitions.
103 103 - Coordination happens when people agree on data, direction, and how to move to the desired place.
104 104 - [Send surveys](https://docs.google.com/forms/d/e/1FAIpQLSfufs_0zOGlFiE6oqrdZU7xCi399CBYbIlZkAMe15GTRRcPZA/viewform) from time to time trying to get pain points and know where issues are.
105 105 - E.g: Do you have access to the data I need to make decisions in your role?

+1 -1

Data Engineering.md

···
25 25 - **[[Modularity]]**: Steps are independent, declarative, and [[Idempotence|idempotent]]. This makes pipelines composable.
26 26 - **Consistency**: Same conventions and design patterns across pipelines. If a failure is actionable by the user, clearly let them know what they can do. Schema on write as there is always a schema.
27 27 - **Efficiency**: Low event latency when needed. Easy to scale up and down. A user should not be able to configure something that will not work. Don't mix heterogeneous workloads under the same tooling (e.g: big data warehouses doing simple queries 95% of their time and 1 big batch once a day).
28 - - **Flexibility**: Steps change to conform data points. Changes don't stop the pipeline or losses data. Fail fast and upstream.
28 + - **Flexibility**: Steps change to conform to data points. Changes don't stop the pipeline or lose data. Fail fast and upstream.
29 29
30 30 ### Data Flow
31 31
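Reviewer note: the Modularity line in the hunk above asks for idempotent steps. A minimal Python sketch of what that could look like, assuming a hypothetical in-memory `store` keyed by partition (the function name and data are illustrative only, not part of the handbook): the step overwrites its partition instead of appending, so re-running it after a failure leaves the same state.

```python
def load_partition(store, partition, rows):
    """Idempotent load: overwrite the whole partition instead of appending."""
    # Hypothetical storage layer: a dict mapping partition key -> list of rows.
    store[partition] = list(rows)
    return store


store = {}
rows = [{"id": 1}, {"id": 2}]

load_partition(store, "2024-01-01", rows)
first_run = dict(store)

# Re-running the same step (e.g. after a retry) does not duplicate data.
load_partition(store, "2024-01-01", rows)
second_run = dict(store)
```

An append-based version of the same step would double the rows on retry, which is the failure mode idempotence is meant to rule out.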

+1 -1

Data Package Manager.md

···
63 63 - **Declarative**. Transformations should be defined as code and be idempotent. Similar to how Pachyderm/Kamu/Holium work.
64 64 - E.g: The transformation tool ends up orchestrating containers/functions that read/write from the storage layer, Pachyderm style.
65 65 - **Environment agnostic**. Can be run locally and remotely. One machine or a cluster. Streaming or batch.
66 - - **Templated**. Having a repository/market of open transformations could empower a bunch of use cases ready to plug in to datasets:
66 + - **Templated**. Having a repository/market of open transformations could empower a bunch of use cases ready to plug into datasets:
67 67 - Detect outliers automatically on tabular data.
68 68 - Resize images.
69 69 - Normalize sound files.

+3 -3

Impact Evaluators.md

···
86 86 - Individual Rationality: Ensuring that every participant has a non-negative utility (or at least no worse off) by participating in the mechanism.
87 87 - Budget Balance: The mechanism generates sufficient revenue to cover its costs or payouts, without running a net deficit.
88 88 - If you do something with a large "impact" and I do something with less "impact". It's clear you deserve more. How much more, is debatible. Depends on the goals of the organizers!
89 - - In most of the mechanisms working nowadays (e.g: [[Deep Funding]]), there are arbitrary decissions that affect the allocation.
89 + - In most of the mechanisms working nowadays (e.g: [[Deep Funding]]), there are arbitrary decisions that affect the allocation.
90 90 - Small rules might have a disproportionate impact.
91 91 - **Legible Impact Attribution**. Make contributions and their value visible.
92 92 - [Transform vague notions of "alignment" into measurable criteria](https://vitalik.eth.limo/general/2024/09/28/alignment.html) that projects can compete on.
···
117 117 - **Make evaluation infrastructure permissionless**. Just as anyone can fork code, anyone should be able to fork evaluation criteria. This prevents capture and enables innovation.
118 118 - Anyone should be able to [fork the evaluation system with their own criteria](https://vitalik.eth.limo/general/2024/09/28/alignment.html), preventing capture and enabling experimentation.
119 119 - [IEs are the scientific method in disguise, like AI evals](https://eugeneyan.com/writing/eval-process/).
120 - - There are two areas of Impmact Evaluators where coordination is needed: allocation rules and mechanism selection.
120 + - There are two areas of Impact Evaluators where coordination is needed: allocation rules and mechanism selection.
121 121 - **Focus on error analysis**. Like in [LLM evaluations](https://hamel.dev/blog/posts/evals-faq/), understanding failure modes matters more than optimizing metrics. Study what breaks and why.
122 122 - IEs will have to do some sort of "error analysis". [Is the most important activity in LLM evals](https://hamel.dev/blog/posts/evals-faq/#q-why-is-error-analysis-so-important-in-llm-evals-and-how-is-it-performed). Error analysis helps you decide what evals to write in the first place. It allows you to identify failure modes unique to your application and data.
123 123 - **Reduce cognitive load for humans**. Let [algorithms handle scale while humans set direction and audit results](https://vitalik.eth.limo/general/2025/02/28/aihumans.html).
124 - - Use humans for sensing qualitative properties, machines for bookkeeping and preserve legitimacy by letting people choose/vote on the prefered evaluation mechanism.
124 + - Use humans for sensing qualitative properties, machines for bookkeeping and preserve legitimacy by letting people choose/vote on the preferred evaluation mechanism.
125 125 - Making it so people don't have to do something is cool. Making it so people can't do that thing is bad. E.g: time saving tools like AI is great but humans should be able to jump in if they want!
126 126 - If people don't want to have their "time saved" have the freedom to express themselves. E.g: offer pairwise comparisons by default but let people expand on feedback or send large project reviews.
127 127 - Information gathering is messy and noisy. It's hard to get a clear picture of what people think. Let people express themselves as much as they want.

+1 -1

Incentives.md

···
20 20 - Intentional system design.
21 21 - Commitment to study the metric.
22 22 - Human values are highly dimensional. Nudging people in the right direction is hard, especially because nudges usually are very low dimensional.
23 - - [Incentives might be perverse and the effects will be unexpected and contrary to the intentions of its designers](https://en.wikipedia.org/wiki/Perverse_incentive) (e.g. Hacktoberfest spamming Pull Requests on Open Source repositories causing more work for mantainers).
23 + - [Incentives might be perverse and the effects will be unexpected and contrary to the intentions of its designers](https://en.wikipedia.org/wiki/Perverse_incentive) (e.g. Hacktoberfest spamming Pull Requests on Open Source repositories causing more work for maintainers).
24 24
25 25 ## Incentive Framework
26 26

+1 -1

Learning.md

···
63 63
64 64 - We all have a web of concepts in our minds, our [[Knowledge Graphs]]. The collection of all the concepts we understand, all of our existing knowledge and intuitions, connected together. And you have learned something when you can convert it to concepts and **connect it to your existing understanding**. This means not just understanding the concept itself, but understanding where it fits into the bigger picture, where to use it, etc.
65 65 - Knowledge is not as a hunt for a single, ultimate and universal truth. [Is similar to the spirit of ecology: a gradual evolution of stably coexisting diversity that speciates and complexifies as it develops](https://www.radicalxchange.org/media/blog/why-i-am-a-pluralist/).
66 - - T[he world is full of things that merely appear good but aren't really](https://sarahconstantin.substack.com/p/what-goes-without-saying), and that it's important to vigilantly sift out the real from the fake. Concepts like Goodhart's Law, cargo-culting, greenwashing, hype cycles, Sturgeon's Law, ... are all pointing at the basic understanding that it's easier to seem good than to be good.
66 + - [The world is full of things that merely appear good but aren't really](https://sarahconstantin.substack.com/p/what-goes-without-saying), and that it's important to vigilantly sift out the real from the fake. Concepts like Goodhart's Law, cargo-culting, greenwashing, hype cycles, Sturgeon's Law, ... are all pointing at the basic understanding that it's easier to seem good than to be good.
67 67
68 68 ## Learning Soft Skills
69 69

+1 -1

Metrics.md

···
55 55 - When output metrics are given as goals, teams can often focus on the wrong inputs or thrash between inputs.
56 56 - Focus on usage first (not revenue first). This is the most common version of outputs vs inputs. Usage creates revenue, revenue does not create usage. As a result, the most important metrics in terms of creating growth are not your revenue metrics, they are your usage metrics.
57 57 - On a similar note, there are leading and lagging indicators. Leading indicators are usually input metrics and are harder to measure. Lagging indicators are usually output metrics and easy to measure.
58 - - Mixing Up Retention and Engagement. Retention and engagement are not the same things. Retention is binary. It answers the question, was this person active within my defined time period? Yes or no. Engagement is is depth. It answers the question, how active were they within the defined timed period? 0→N. Engagement is one of three major inputs into driving retention.
58 + - Mixing Up Retention and Engagement. Retention and engagement are not the same things. Retention is binary. It answers the question, was this person active within my defined time period? Yes or no. Engagement is depth. It answers the question, how active were they within the defined timed period? 0→N. Engagement is one of three major inputs into driving retention.
59 59 - Customers vs Users. A customer and a user is not the same thing in most business models. A customer is defined as the person/group that is paying you. A user is a person using the product.
60 60 - In subscription products, oftentimes there are multiple users associated with a single customer. Or people are users before they are a customer. You need to separate the definition and language between these two things for teams to clearly act on them.
61 61 - You don't need perfect accuracy sometimes. Moving in the right direction counts (i.e. fitbit heartrate is off but variance is still useful).
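Reviewer note: the retention vs engagement distinction in the hunk above (binary "was this person active?" versus depth "0→N") can be made concrete with a small Python sketch. The event log, user names, and the choice of "distinct active days" as the engagement measure are all assumptions made for illustration; the note itself does not prescribe a specific measure.

```python
from datetime import date

# Hypothetical activity log: (user_id, day on which the user was active).
events = [
    ("ana", date(2024, 1, 3)),
    ("ana", date(2024, 1, 9)),
    ("ana", date(2024, 1, 21)),
    ("bob", date(2024, 1, 15)),
    ("cat", date(2023, 12, 30)),  # falls outside the January window
]


def active_days(events, start, end):
    """Count distinct active days per user within [start, end]."""
    days = {}
    for user, day in events:
        if start <= day <= end:
            days.setdefault(user, set()).add(day)
    return {user: len(user_days) for user, user_days in days.items()}


def retention_and_engagement(events, users, start, end):
    """Return {user: (retained, engagement)} for the period.

    retained is binary: was the user active at all in the period?
    engagement is depth: how many distinct days were they active (0..N)?
    """
    counts = active_days(events, start, end)
    return {u: (counts.get(u, 0) > 0, counts.get(u, 0)) for u in users}


january = retention_and_engagement(
    events, ["ana", "bob", "cat"], date(2024, 1, 1), date(2024, 1, 31)
)
```

In this sketch `ana` and `bob` are both retained in January, but their engagement differs (three active days versus one), which is the information a retention number alone would hide.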

+1 -1

Organizations.md

···
54 54 - [As organizations become less efficient / less effective, they need more and more managers to "manage" that inefficiency. This kicks off a wicked cycle, because they'll self-identify with managing a problem ... which reinforces it.](https://twitter.com/johncutlefish/status/1472669773410410504)
55 55 - [Management in large, dysfunctional companies is a game about promising to ship things to people further up your chain](https://ludic.mataroa.blog/blog/brainwash-an-executive-today). Organizations select for people who support convenient narratives and can maintain positive messaging regardless of reality.
56 56 - It might be interesting to cap the core team size at N people (e.g: 15). Focus on solving one problem, and do it well.
57 - - [When you scale, you automate. This is good and bad. It's nice to be able to get a refund automatically when an item is missing from your order. It's frustrating trying to figure out the right incantation to trick a chatbot in to connecting you to a human. If you can afford it, don't scale past the number of users you can excellently serve. Don't scale to a point where you can't excellently polish your software.](https://samwho.dev/blog/scale-is-poison/)
57 + - [When you scale, you automate. This is good and bad. It's nice to be able to get a refund automatically when an item is missing from your order. It's frustrating trying to figure out the right incantation to trick a chatbot into connecting you to a human. If you can afford it, don't scale past the number of users you can excellently serve. Don't scale to a point where you can't excellently polish your software.](https://samwho.dev/blog/scale-is-poison/)
58 58 - [The art of org design is essentially effective iteration towards form-context fit. You need four sub-skills to do effective iteration](https://commoncog.com/blog/org-design-skill/). To get good at org design, you need to build more accurate models of the people in your org, learn how they respond to [[incentives]], and in build enough power and credibility to get your org changes to take place.
59 59 - Internally, you should not "sell", but truth seek.
60 60 - Transparency is a vital value. Transparency gets people to treat the company as their own.

+1 -1

Politics.md

···
25 25 - Your perception of reality has probably been at least a little manipulated. Your opponents are behaving the way they are based on a perception of reality that's different from your own.
26 26 - What does this look/feel like to the people I don't know?
27 27 - Everyone belongs to a tribe and underestimates how influential that tribe is on their thinking. Tribes reduce the ability to challenge ideas or diversify your views because no one wants to lose support of the tribe. Tribes are as self-interested as people, encouraging [[ideas]] and narratives that promote their survival. But they're exponentially more influential than any single person. So tribes are very effective at promoting views that aren't analytical or rational, and people loyal to their tribes are very poor at realizing it.
28 - - Utopia can't be planed from scratch! Push decisions to the edges (localism) where they have [[incentives]] to make good choices.
28 + - Utopia can't be planned from scratch! Push decisions to the edges (localism) where they have [[incentives]] to make good choices.
29 29 - A good counter argument is that people might not be educated enough to make the best decision and a centralized institution could do it much better for them (e.g: a government banning lead from most products is credited with the most significant global drop in crime rates in decades).
30 30 - Most political debates are people with different time horizons talking over each other.
31 31 - [Liberalism has a few big economic problems](https://slatestarcodex.com/2017/02/22/repost-the-non-libertarian-faq/); [[coordination]] issues (Moloch), irrationality and lack of information.

+1 -1

Problem Solving.md

···
40 40 - Keep the end goal in mind. [Don't Shave That Yak](https://seths.blog/2005/03/dont_shave_that/)!
41 41 - [The Copenhagen Interpretation of Ethics](https://laneless.substack.com/p/the-copenhagen-interpretation-of-ethics) says that when you observe or interact with a problem in any way, you can be blamed for it. At the very least, you are to blame for not doing _more_. ^ec616e
42 42 - Social problems demand social solutions. Not everything can be solved by technology. E.g: If you're skeptical about Wikipedia, you can easily create your own fork of Wikipedia and improve it. You'll have to deal with the social problems of convincing others to use your fork, etc.
43 - - To convince someone, show historical attemps failures and analyze them.
43 + - To convince someone, show historical attempts failures and analyze them.
44 44 - In complex problem spaces, [focus on direction](https://thecompendium.cards/c/everything/sort/stars/direction-not-solution) and not on the details. Don't focus too much on trying to find the best idea. The thing you have to prove is not that an idea is the best one, but that it's better than doing what you're currently doing.
45 45 - [You cannot solve a people problem with a technical solution. Most technical problems are really people problems](https://blog.joeschrag.com/2023/11/most-technical-problems-are-really.html). Most people problems are [[Coordination]] and [[Communication]] problems.
46 46 - All technology produces second and third order effects beyond what was intended, and [[Feedback Loops]] often magnify them in complex systems.

+1 -1

Processes.md

···
22 22 - Low Friction. Simple processes are easier to understand and apply. [Trivial inconveniences usually have more implications than it seems](https://www.lesswrong.com/posts/reitXJgJXFzKpdKyd/beware-trivial-inconveniences).
23 23 - Short [[Feedback Loops]]. Show the results as soon as you can.
24 24 - [[Idempotence|Idempotent]] processes are easy to manage
25 - - Write it down. Writting what's happening can be a giant leap forward in terms of getting people to agree on what the process actually is.
25 + - Write it down. Writing what's happening can be a giant leap forward in terms of getting people to agree on what the process actually is.
26 26
27 27 A process takes an input to produce an output. Group of processes can be viewed as [[systems]].
28 28

+2 -2

Programming.md

···
47 47 - **Data is only useful as long as it's being used**.
48 48 - Flat files help ensure that data is usable for the longest possible time.
49 49 - For complex data structures where plain text really isn't appropriate, use a structured text format instead. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. [Data structures, not algorithms, are central to programming](https://users.ece.utexas.edu/~adnan/pike.html).
50 - - **A programmer who can't re-use other programs is condemned to re-write them**.
50 + - **A programmer who can't reuse other programs is condemned to re-write them**.
51 51 - Use software leverage to your advantage.
52 - - Many programmers have only a superficial understanding of the importance of re-usable code modules.
52 + - Many programmers have only a superficial understanding of the importance of reusable code modules.
53 53 - [Code isn't just meant to be executed. Code is also a means of [[Communication]] across a team, a way to describe to others the solution to a problem](https://medium.com/s/story/notes-to-myself-on-software-engineering-c890f16f4e4d). Good writing skills often correlate with good thinking and programming skills. [Sharing knowledge through writing (blogs, talks, documentation, open source) clarifies your thinking and helps others](https://endler.dev/2025/best-programmers/).
54 54 - **Silence is golden**.
55 55 - A silent command is often more usable, providing the function asked for and nothing more.

+2 -2

Reverse ETL.md

···
14 14 - You can use the real source of truth for all the events and not rely on tracking only.
15 15 - You can join sources like ChartMogul, Customer.io, etc,
16 16 - You can create more interesting events by enriching the events and user profiles with extra properties/traits (Trial Started with a conversion provability attached). Makes product analytics much more powerful.
17 - - It is much easier to re-use the data available in the warehouse than it is to import the data in any new tool we use in the future.
17 + - It is much easier to reuse the data available in the warehouse than it is to import the data in any new tool we use in the future.
18 18 - You can be much more flexible with the tools we want to use because the data is shared and owned by us.
19 - - You avoid being locked in to BI tools like Mixpanel since the logic will be stored in our warehouse.
19 + - You avoid being locked into BI tools like Mixpanel since the logic will be stored in our warehouse.
20 20 - As any new tool, it gives more flexibility and power.
21 21 - The current state is the starting point! We start using it to fix some issues or add some interesting profile properties

+2 -2

Teamwork.md

···
76 76 5. [[Automation|Automate]] and keep standards.
77 77 - Keep great global [[coordination]] and incentive local experimentation.
78 78 - Being able to run small and compounding experiments (on the product or company [[processes]] and systems) is important. **Work smaller**.
79 - - [Some experiments won't work](https://www.lesswrong.com/posts/97LgacucCxmyjYiNT/the-archipelago-model-of-community-standards). But oftentimes it _feels_ like it wont work when in fact you just haven't stuck with it long enough for it to bear fruit. This is hard enough for _solo_ experiments. For group experiments, where not just one but _many_ people must all try a thing at once and _get good at it_, all it takes is a little defection to spiral into a mass exodus.
79 + - [Some experiments won't work](https://www.lesswrong.com/posts/97LgacucCxmyjYiNT/the-archipelago-model-of-community-standards). But oftentimes it _feels_ like it won't work when in fact you just haven't stuck with it long enough for it to bear fruit. This is hard enough for _solo_ experiments. For group experiments, where not just one but _many_ people must all try a thing at once and _get good at it_, all it takes is a little defection to spiral into a mass exodus.
80 80 - The group with the most power determine the system that reflect and reinforce their own way of thinking. Aim for inclusion. _Diversity is being invited to the party. Inclusion is being asked to dance and help organizing the party_.
81 81 - [Brainstorm for questions first (explore). Then find the answers (exploit).](http://web.archive.org/web/20240522210302/https://getpocket.com/explore/item/better-brainstorming)
82 82 - Strive for constructive conflict. Get people to [[Asking Questions|ask questions]]. Engage in passionate, unfiltered debate about what you need to do to succeed.
···
116 116 - You have to put in more effort to make something appear effortless. Effortless, elegant performances are often the result of a large volume of effortful. Praise this instead of complex solutions.
117 117 - Invisible work will happen. If you're doing it, make an effort to share and get credit for it. Build a narrative (story) for your work. Arm your manager and fight recency bias keeping track of all the things you've done.
118 118 - As a manager, give problems to solve, not solutions. Make sure the team knows what they're working toward and that it has the resources needed to complete the work.
119 - - Most software or processes should be opinionated. In increases [[Coordination|collaboration]]. Flexible processes lets everyone invent their own workflows, which eventually creates chaos as teams scale.
119 + - Most software or processes should be opinionated. It increases [[Coordination|collaboration]]. Flexible processes lets everyone invent their own workflows, which eventually creates chaos as teams scale.
120 120 - As teams scale, traditional approaches to decision making force a tradeoff between transparency and efficiency.
121 121 - The easiest way to ensure everyone can understand the how and why of a decision is to adopt systems that, through their daily operation, ensure such context is automatically and readily available to those who might want it (and explicitly not only those who presently need it).
122 122 - [Run 1:1s (one-on-ones)](https://erik.wiffin.com/posts/how-to-get-the-most-out-of-your-11s/). A recurring meeting with no set agenda between a manager and one of their reports. Don't make it a status update (these should be async). Chat about anything bothering you, career growth or type of work that is interesting for you. End it with actionable next steps.

+1 -1

Themes.md

···
29 29
30 30 These are the 3 themes I try to keep in mind at my home.
31 31
32 - - No bad knifes.
32 + - No bad knives.
33 33 - Make the right thing to do the easiest thing to do.
34 34 - Make easy to [[Automation|automate]] stuff.
