{
  "id": "https://ryan.freumh.org/claude-code.html",
  "title": "A Week With Claude Code",
  "link": "https://ryan.freumh.org/claude-code.html",
  "updated": "2025-04-21T00:00:00",
  "published": "2025-04-21T00:00:00",
  "summary": "<div>\n \n <span>Published 21 Apr 2025.</span>\n \n \n </div>\n \n \n\n <p><span>I tried using <a href=\"https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview\">Claude\nCode</a> while writing <a href=\"caledonia.html\">Caledonia</a>, and these\nare the notes I took on the experience. It’s possible some of the\ndeficiencies are due to the model’s smaller training set of OCaml code\ncompared to more popular languages, but there’s <a href=\"https://www.youtube.com/watch?v=0ML7ZLMdcl4\">work being done</a>\nto improve this situation.</span></p>\n<p><span>It needs a lot of hand-holding, often finding it\nvery difficult to get out of simple mistakes. For example, it frequently\nforgot to bracket nested match statements,</span></p>\n<div><pre><code><span><a href=\"#cb1-1\"></a><span>match</span> expr1 <span>with</span></span>\n<span><a href=\"#cb1-2\"></a>| Pattern1 -></span>\n<span><a href=\"#cb1-3\"></a> <span>match</span> expr2 <span>with</span></span>\n<span><a href=\"#cb1-4\"></a> | Pattern2a -> result2a</span>\n<span><a href=\"#cb1-5\"></a> | Pattern2b -> result2b</span>\n<span><a href=\"#cb1-6\"></a>| Pattern2 -> result2</span></code></pre></div>\n<p><span>and it found it difficult to fix this as the\ncompiler error message only showed the line with <code>Pattern2</code>. An interesting note here is that tools\nthat are easy for humans to use, e.g. with great error messages, are\nalso easy for the LLM to use. But, unlike (I hope) a human, even after\nadding a rule to avoid this in <code>CLAUDE.md</code>\nit frequently ignored it.</span></p>\n<p><span>It often makes code very verbose or inelegant,\nespecially after repeated rounds of back-and-forth with the compiler. It\nrarely shortens code, whereas some of the best changes I make to\ncodebases have a negative impact on the lines of code (LoC) count. I\nthink this is how you end up with <a href=\"https://news.ycombinator.com/item?id=43553031\">35k LoC</a> recipe\napps, and I wonder how maintainable these codebases will\nbe.</span></p>\n<p><span>If you give it a high-level task, even after\ncreating an architecture plan, it often makes poor design decisions that\ndon’t consider future scenarios. For example, it combined all the <code>.ics</code> files into a single calendar which, when it\ncomes to modifying events, will make it impossible to write edits\nback. Another example of where it unnecessarily constrained interfaces\nwas by making query and sorting parameters variants, whereas <a href=\"https://github.com/RyanGibb/caledonia/commit/d97295ec46699fbe91fd4c15f9eef10b80c136f1#diff-08751a7fee23e5d1046033b7792d84a759ea253862ba382a492d0621727a097c\">porting</a>\nto a lambda and comparator allowed for more expressivity with the same\nbrevity.</span></p>\n<p><span>But while programming I often find myself doing a\nlot of ‘plumbing’ things through, and it excels at these more mundane\ntasks. It’s also able to do more intermediate tasks, with some back and\nforth about design decisions. For example, once I got the list command\nworking it was able to get the query command working without me writing\nany code – just prompting with design suggestions like pulling common\nparameters into a separate module (see the verbosity point again).\nAnother example of a task where it excels is writing command line\nargument parsing logic, with more documentation than I would have the\nwill to write myself.</span></p>\n<p><span>It’s also awesome to get it to write tests where I\nwould otherwise never write them for a personal project, even with the above\ncaveats applying to them. It also gives the model something to check\nagainst when making changes, though when encountering errors with tests\nit tends to change the test to be incorrect to pass the compiler, rather\nthan fixing the underlying problem.</span></p>\n<p><span>It’s somewhat concerning that this agent is running\nwithout any sandboxing. There is some degree of control over what\ndirectories it can access, and what tools it can invoke, but I’m sure a\nsufficiently motivated adversary could trivially get around all of them.\nWhile deploying <a href=\"enki.html\">Enki</a> on <a href=\"https://github.com/RyanGibb/nixos/tree/master/hosts/hippo\">hippo</a>\nI tested out using it to change the NixOS config, and after making the\nchange it successfully invoked <code>sudo</code> to do\na <code>nixos-rebuild switch</code> as I had just used\nsudo myself in the same shell session. Patrick’s work on <a href=\"https://patrick.sirref.org/shelter/index.xml\">shelter</a> could\nprove invaluable for this, while also giving the agent ‘rollback’\ncapabilities!</span></p>\n<p><span>Something I’m wondering about while using these\nagents is whether they’ll just be another tool to augment the\ncapabilities of software engineers; or if they’ll increasingly replace\nthe need for software engineers entirely.</span></p>\n<p><span>I tend towards the former, but only time will\ntell.</span></p>\n<p><span>If you have any questions or comments on this feel\nfree to <a href=\"about.html#contact\">get in touch</a>.</span></p>",
  "content": "<div>\n \n <span>Published 21 Apr 2025.</span>\n \n \n </div>\n \n \n\n <p><span>I tried using <a href=\"https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview\">Claude\nCode</a> while writing <a href=\"caledonia.html\">Caledonia</a>, and these\nare the notes I took on the experience. It’s possible some of the\ndeficiencies are due to the model’s smaller training set of OCaml code\ncompared to more popular languages, but there’s <a href=\"https://www.youtube.com/watch?v=0ML7ZLMdcl4\">work being done</a>\nto improve this situation.</span></p>\n<p><span>It needs a lot of hand-holding, often finding it\nvery difficult to get out of simple mistakes. For example, it frequently\nforgot to bracket nested match statements,</span></p>\n<div><pre><code><span><a href=\"#cb1-1\"></a><span>match</span> expr1 <span>with</span></span>\n<span><a href=\"#cb1-2\"></a>| Pattern1 -></span>\n<span><a href=\"#cb1-3\"></a> <span>match</span> expr2 <span>with</span></span>\n<span><a href=\"#cb1-4\"></a> | Pattern2a -> result2a</span>\n<span><a href=\"#cb1-5\"></a> | Pattern2b -> result2b</span>\n<span><a href=\"#cb1-6\"></a>| Pattern2 -> result2</span></code></pre></div>\n<p><span>and it found it difficult to fix this as the\ncompiler error message only showed the line with <code>Pattern2</code>. An interesting note here is that tools\nthat are easy for humans to use, e.g. with great error messages, are\nalso easy for the LLM to use. But, unlike (I hope) a human, even after\nadding a rule to avoid this in <code>CLAUDE.md</code>\nit frequently ignored it.</span></p>\n<p><span>It often makes code very verbose or inelegant,\nespecially after repeated rounds of back-and-forth with the compiler. It\nrarely shortens code, whereas some of the best changes I make to\ncodebases have a negative impact on the lines of code (LoC) count. I\nthink this is how you end up with <a href=\"https://news.ycombinator.com/item?id=43553031\">35k LoC</a> recipe\napps, and I wonder how maintainable these codebases will\nbe.</span></p>\n<p><span>If you give it a high-level task, even after\ncreating an architecture plan, it often makes poor design decisions that\ndon’t consider future scenarios. For example, it combined all the <code>.ics</code> files into a single calendar which, when it\ncomes to modifying events, will make it impossible to write edits\nback. Another example of where it unnecessarily constrained interfaces\nwas by making query and sorting parameters variants, whereas <a href=\"https://github.com/RyanGibb/caledonia/commit/d97295ec46699fbe91fd4c15f9eef10b80c136f1#diff-08751a7fee23e5d1046033b7792d84a759ea253862ba382a492d0621727a097c\">porting</a>\nto a lambda and comparator allowed for more expressivity with the same\nbrevity.</span></p>\n<p><span>But while programming I often find myself doing a\nlot of ‘plumbing’ things through, and it excels at these more mundane\ntasks. It’s also able to do more intermediate tasks, with some back and\nforth about design decisions. For example, once I got the list command\nworking it was able to get the query command working without me writing\nany code – just prompting with design suggestions like pulling common\nparameters into a separate module (see the verbosity point again).\nAnother example of a task where it excels is writing command line\nargument parsing logic, with more documentation than I would have the\nwill to write myself.</span></p>\n<p><span>It’s also awesome to get it to write tests where I\nwould otherwise never write them for a personal project, even with the above\ncaveats applying to them. It also gives the model something to check\nagainst when making changes, though when encountering errors with tests\nit tends to change the test to be incorrect to pass the compiler, rather\nthan fixing the underlying problem.</span></p>\n<p><span>It’s somewhat concerning that this agent is running\nwithout any sandboxing. There is some degree of control over what\ndirectories it can access, and what tools it can invoke, but I’m sure a\nsufficiently motivated adversary could trivially get around all of them.\nWhile deploying <a href=\"enki.html\">Enki</a> on <a href=\"https://github.com/RyanGibb/nixos/tree/master/hosts/hippo\">hippo</a>\nI tested out using it to change the NixOS config, and after making the\nchange it successfully invoked <code>sudo</code> to do\na <code>nixos-rebuild switch</code> as I had just used\nsudo myself in the same shell session. Patrick’s work on <a href=\"https://patrick.sirref.org/shelter/index.xml\">shelter</a> could\nprove invaluable for this, while also giving the agent ‘rollback’\ncapabilities!</span></p>\n<p><span>Something I’m wondering about while using these\nagents is whether they’ll just be another tool to augment the\ncapabilities of software engineers; or if they’ll increasingly replace\nthe need for software engineers entirely.</span></p>\n<p><span>I tend towards the former, but only time will\ntell.</span></p>\n<p><span>If you have any questions or comments on this feel\nfree to <a href=\"about.html#contact\">get in touch</a>.</span></p>",
  "content_type": "html",
  "categories": [],
  "source": "https://ryan.freumh.org/atom.xml"
}