<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Strong Words: Case Studies]]></title><description><![CDATA[Fast AI Training in action.]]></description><link>https://words.strongcompute.com/s/case-studies</link><image><url>https://substackcdn.com/image/fetch/$s_!Rmo5!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F731e7c1d-8da7-4129-9323-70d0f5f1e0f3_2046x2046.jpeg</url><title>Strong Words: Case Studies</title><link>https://words.strongcompute.com/s/case-studies</link></image><generator>Substack</generator><lastBuildDate>Sat, 02 May 2026 18:21:06 GMT</lastBuildDate><atom:link href="https://words.strongcompute.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Strong Compute]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[strongcomputewords@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[strongcomputewords@substack.com]]></itunes:email><itunes:name><![CDATA[Strong Compute]]></itunes:name></itunes:owner><itunes:author><![CDATA[Strong Compute]]></itunes:author><googleplay:owner><![CDATA[strongcomputewords@substack.com]]></googleplay:owner><googleplay:email><![CDATA[strongcomputewords@substack.com]]></googleplay:email><googleplay:author><![CDATA[Strong Compute]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Scaling from 5 to 256 GPUs with zero dev-ops in one week. 
]]></title><description><![CDATA[Accelerating Medical AI: How LayerJot Transformed Infrastructure Management with Strong Compute]]></description><link>https://words.strongcompute.com/p/scaling-from-5-to-256-gpus-with-zero</link><guid isPermaLink="false">https://words.strongcompute.com/p/scaling-from-5-to-256-gpus-with-zero</guid><dc:creator><![CDATA[Strong Compute]]></dc:creator><pubDate>Mon, 02 Feb 2026 00:05:10 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b4a501a0-c0e8-435d-a917-48861cfd1d24_2240x1260.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>Without Strong Compute, this would have taken two full-time engineers 3-6 months.</p><p><strong>Before</strong></p><ul><li><p>On-premises compute hardware limited to 5 NVIDIA GPUs</p></li><li><p>Slow job migration and deployment between cloud providers</p></li><li><p>Limited visibility into resource utilization</p></li><li><p>High operational overhead managing compute resources</p></li></ul><p><strong>After</strong></p><ul><li><p>44 experiments run across 6 separate AI projects - 23 rapid iteration experiments, 21 long-run training experiments</p></li><li><p>6.5 hours total training time on 256 GPUs in 90 cloud machines across 3 different cloud providers - including H100 and A100 instances</p></li></ul><h2><strong>Challenge: Complex AI workloads, scarce hardware</strong></h2><p>LayerJot, a cutting-edge med-tech startup in Belmont, CA, faced a critical challenge common to AI-driven research teams: managing complex, compute-intensive workloads across multiple datasets and models.</p><p>LayerJot&#8217;s projects span:</p><ul><li><p>Computer vision for medical equipment catalog
processing</p></li><li><p>Multi-modal AI models like CLIP and Llama</p></li><li><p>Generalist robot policy models for surgical equipment handling</p></li></ul><h2><strong>Solution: Scaling from 5 to 256 GPUs with zero dev-ops</strong></h2><p>Strong Compute deployed an AI engineer on-site with LayerJot for a full week, working shoulder-to-shoulder with their team to optimize infrastructure and accelerate their AI workloads using the Strong Compute Instant Super Computer.</p><p></p><h2><strong>Technical Deep Dive: Datasets and Model Adaptation</strong></h2><h4>Data Ingested</h4><div id="datawrapper-iframe" class="datawrapper-wrap outer" data-attrs="{&quot;url&quot;:&quot;https://datawrapper.dwcdn.net/Il053/5/&quot;,&quot;thumbnail_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/25b56f68-5b09-458e-8b8c-abe8ff3d5795_1220x430.png&quot;,&quot;thumbnail_url_full&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0707cf03-5699-434f-a49e-e362e438597a_1220x500.png&quot;,&quot;height&quot;:205,&quot;title&quot;:&quot;Created with Datawrapper&quot;,&quot;description&quot;:&quot;&quot;}" data-component-name="DatawrapperToDOM"><iframe id="iframe-datawrapper" class="datawrapper-iframe" src="https://datawrapper.dwcdn.net/Il053/5/" width="730" height="205" frameborder="0" scrolling="no"></iframe><script type="text/javascript">!function(){"use strict";window.addEventListener("message",(function(e){if(void 0!==e.data["datawrapper-height"]){var t=document.querySelectorAll("iframe");for(var a in e.data["datawrapper-height"])for(var r=0;r<t.length;r++){if(t[r].contentWindow===e.source)t[r].style.height=e.data["datawrapper-height"][a]+"px"}}}))}();</script></div><h4>Models Adapted</h4><div id="datawrapper-iframe" class="datawrapper-wrap outer" 
data-attrs="{&quot;url&quot;:&quot;https://datawrapper.dwcdn.net/bC5bv/2/&quot;,&quot;thumbnail_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d27af8b6-0ca8-411c-8d76-b2d0a7eb1298_1220x764.png&quot;,&quot;thumbnail_url_full&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e2a42a1-57c6-4501-bda9-16b75111a7f2_1220x764.png&quot;,&quot;height&quot;:372,&quot;title&quot;:&quot;Created with Datawrapper&quot;,&quot;description&quot;:&quot;&quot;}" data-component-name="DatawrapperToDOM"><iframe id="iframe-datawrapper" class="datawrapper-iframe" src="https://datawrapper.dwcdn.net/bC5bv/2/" width="730" height="372" frameborder="0" scrolling="no"></iframe><script type="text/javascript">!function(){"use strict";window.addEventListener("message",(function(e){if(void 0!==e.data["datawrapper-height"]){var t=document.querySelectorAll("iframe");for(var a in e.data["datawrapper-height"])for(var r=0;r<t.length;r++){if(t[r].contentWindow===e.source)t[r].style.height=e.data["datawrapper-height"][a]+"px"}}}))}();</script></div><h2><strong>On-Site Collaboration: Beyond Infrastructure Management</strong></h2><p>For one intensive week, Strong Compute embedded an AI engineer directly at LayerJot&#8217;s Belmont, CA office. 
Our engineer worked side-by-side with LayerJot&#8217;s team, providing:</p><ul><li><p>Real-time infrastructure optimization</p></li><li><p>Hands-on model adaptation support</p></li><li><p>Direct troubleshooting of complex AI workload challenges</p></li><li><p>Custom infrastructure configuration tailored to LayerJot&#8217;s unique research needs</p></li></ul><h3><strong>Key Outcomes</strong></h3><ul><li><p>Resolved Dense Encoder code base issues and successfully ran experiments</p></li><li><p>Adapted CLIP-style model for Strong Compute checkpointing</p></li><li><p>Successfully trained VLA Robotics  repo in interactive containers</p></li><li><p>Integrated model checkpoints from ingested datasets</p></li><li><p>Demonstrated Claude Code&#8217;s capability to adapt complex legacy code bases for training on Strong Compute!</p></li></ul><h2><strong>Breakthrough Results</strong></h2><h3><strong>Performance Metrics</strong></h3><ul><li><p>Reduced job deployment time from hours to minutes</p></li><li><p>60GB/sec inter-cloud data transfer speed</p></li><li><p>7.8-second container launch times</p></li></ul><h3><strong>Operational Impact</strong></h3><ul><li><p>Resolved complex code base integration challenges</p></li><li><p>Enabled continuous experiment-based training</p></li><li><p>Simplified multi-provider infrastructure management</p></li></ul><h2><strong>Quote from the Customer</strong></h2><p>&#8220;Strong Compute transformed how we think about infrastructure. 
It&#8217;s not just a tool; they are a strategic partner in our AI development.&#8221; - Soren Harner, CEO, LayerJot</p><h2><strong>Looking Forward</strong></h2><p>LayerJot is now positioned to:</p><ul><li><p>Scale AI research more rapidly</p></li><li><p>Reduce infrastructure management overhead</p></li><li><p>Accelerate medical technology innovation</p></li></ul><p><a href="https://cp.strongcompute.ai/">Try Strong Compute Today</a></p><p><em>Strong Compute: Complete Command and Control for GPU Compute</em></p>]]></content:encoded></item><item><title><![CDATA[Arcified.AI Winning Playbook for Strong Compute ARC AGI 2 Hackathon]]></title><description><![CDATA[ML Engineers at 2K Games and Google DeepMind built ARC Evolve, solving 80% of training puzzles&#8212;far surpassing frontier models.]]></description><link>https://words.strongcompute.com/p/arcifiedai-winning-playbook-for-strong</link><guid isPermaLink="false">https://words.strongcompute.com/p/arcifiedai-winning-playbook-for-strong</guid><dc:creator><![CDATA[Strong Compute]]></dc:creator><pubDate>Wed, 18 Jun 2025 23:32:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AGjj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Strong&#8239;Compute hosted a 24&#8209;hour, round&#8209;the&#8209;clock sprint focused on the <strong><a href="https://arcprize.org/">ARC&#8239;AGI&#8239;2</a></strong><a href="https://arcprize.org/"> challenge</a>. 
When the dust settled, the overall prize in <em>Competition&#8239;A</em> went to a two&#8209;person team operating under <strong>Arcified&#8239;.AI</strong>.</p><p>Arcified&#8217;s members were <strong><a href="https://www.linkedin.com/in/vijaygohil/">Vijayraj Gohil</a></strong>, an ML engineer at 2K&#8239;Games, and <strong><a href="https://www.linkedin.com/in/aditya-shahh/">Aditya&#8239;Shah</a></strong>, an ML engineer at Google&#8239;DeepMind. Their final system, nicknamed <strong>ARC&#8239;Evolve</strong>, reached an <strong>&#8776;&#8239;80&#8239;% full&#8209;solve rate</strong> on training puzzles, far outperforming baseline numbers typically reported for large frontier models.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AGjj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AGjj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!AGjj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!AGjj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg 1272w,
https://substackcdn.com/image/fetch/$s_!AGjj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AGjj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15799189,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://words.strongcompute.com/i/166278873?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AGjj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!AGjj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!AGjj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!AGjj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36c3ffed-e0c3-4404-bc8b-c3172d1daaf8_6000x4000.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption"><strong>Hackathon Winners: Vijay&#8239;raj Gohil and Aditya
Shah</strong></figcaption></figure></div><div><hr></div><h3><strong>What is ARC&#8239;AGI&#8239;2?</strong></h3><p>The ARC (Abstraction&#8239;&amp;&#8239;Reasoning Corpus) tasks created by Fran&#231;ois&#8239;Chollet test a model&#8217;s ability to infer symbolic transformations from tiny demonstration sets. ARC&#8239;AGI&#8239;2 raises the bar with new transformation families and a strict &#8220;all&#8209;or&#8209;nothing&#8221; scoring rule: a task is counted only if the model reproduces the entire output grid perfectly. The benchmark has become a proving ground for methods that claim progress toward more general reasoning.</p><div><hr></div><h3><strong>The Thought Behind the Build</strong></h3><p>Vijayraj Gohil and Aditya Shah together sketched a strategy they called <strong>&#8220;small data, big search.&#8221;</strong> That ethos shaped every design choice that followed.</p><div><hr></div><h2><strong>Strategy</strong></h2><h3><strong>From One&#8209;Shot RL&#8239;VR to AlphaEvolve&#8209;Style Search</strong></h3><p>Arcified&#8217;s technical recipe fuses two complementary ideas drawn from very recent literature:</p><ul><li><p><strong>One&#8209;Shot Reinforcement Learning with Verifiable Rewards (RL&#8239;VR)</strong>. A paper released in April&#8239;2025 showed that, for reasoning&#8209;heavy datasets, fine&#8209;tuning with just one carefully chosen example plus a binary &#8220;full&#8209;solve&#8221; reward can match or exceed thousand&#8209;sample runs. Arcified used this paradigm to initialize a compact 7&#8209;billion&#8209;parameter language model for ARC&#8239;AGI&#8239;2.</p></li><li><p><strong>AlphaEvolve search</strong>. Google&#8239;DeepMind&#8217;s AlphaEvolve project demonstrated how an LLM&#8209;guided evolutionary loop could discover matrix&#8209;multiplication breakthroughs after decades of stagnation.
Arcified adapted the same idea to iteratively refine chains&#8209;of&#8209;thought for ARC puzzles, letting a high&#8209;precision evaluator provide graded feedback between generations.</p></li></ul><p>By combining the two, the team produced a self&#8209;improving loop: RL&#8239;VR delivers an initial policy; AlphaEvolve&#8209;style search mutates that policy&#8217;s reasoning trace until it converges on a stable program that maps input to output.</p><div><hr></div><h3><strong>How It Works&#8212;A Closer Look</strong></h3><ol><li><p><strong>Task taxonomy and sampling<br></strong> ARC&#8239;AGI&#8239;2 examples fall into three geometric regimes:</p><ul><li><p><em>No&#8209;change</em> (input and output are the same size),</p></li><li><p><em>Contraction</em> (output is smaller), and</p></li><li><p><em>Expansion</em> (output is larger).<br></p></li></ul></li><li><p>Arcified built histograms to quantify the prevalence of each regime in the public training set, then repeated the analysis on held&#8209;out evaluation tasks. They discovered that most puzzles clustered in the no&#8209;change and contraction buckets. Using that insight, they curated <strong>ten &#8220;high&#8209;entropy&#8221; samples</strong>&#8212;balanced across regime and across three difficulty bands (easy, medium, hard)&#8212;to act as the sole training pool.<br></p></li><li><p><strong>Group&#8239;Relative Policy Optimisation (GRPO)<br><br></strong> The ten samples were duplicated and permuted to form a synthetic mini&#8209;corpus. GRPO fine&#8209;tuning rewarded only perfect grid matches (1/0 signal), steadily raising the policy&#8217;s success on unchanged&#8209;size puzzles to the mid&#8209;80&#8209;percent range.<br></p></li><li><p><strong>Evolutionary refinement<br><br></strong> Each RL&#8209;generated chain&#8209;of&#8209;thought (CoT) was passed to an evaluator LLM that produced fine&#8209;grained scores on intermediate steps. 
Those scores fed an evolutionary loop that mutated, recombined, and re&#8209;ranked CoTs, repeatedly bootstrapping better transforms until the evaluator&#8217;s reward plateaued.<br></p></li><li><p><strong>Deterministic program extraction<br><br></strong> The final CoT was translated into concise, deterministic grid&#8209;manipulation code, ensuring reproducibility for judging.<br></p></li></ol><div><hr></div><h3><strong>Infrastructure Notes</strong></h3><p>They ran initial experiments on <strong><a href="http://strongcompute.com/">Strong&#8239;Compute Burst Workstations</a></strong>; once tested, they scaled up training on the company&#8217;s <strong>ISC cluster of H100 GPUs</strong>, spun up on demand within minutes. Built&#8209;in hot&#8209;swap utilities and cycling_utils functionality made it straightforward to patch issues without interrupting the 24&#8209;hour clock.</p><div><hr></div><h3><strong>Demo Day</strong></h3><p>During a ten&#8209;minute slot, Arcified presented a <a href="https://docs.google.com/presentation/d/17f3aFA1XEIFqLk9RSQbdeNM7v0Pou0xetJ2CNeMIugo/edit?slide=id.g35a19fdc33b_0_173#slide=id.g35a19fdc33b_0_173">concise slide deck</a>: methodology overview, before&#8209;and&#8209;after solve counts, and a comparison showing their 85&#8239;% success rate next to the single&#8209;digit scores typical of Gemini&#8239;2.5&#8239;Pro and OpenAI&#8239;o3 on the same training samples. Judges highlighted the rigorous data sampling strategy and clear empirical gains.</p><div><hr></div><h3><strong>What Comes Next</strong></h3><p>Arcified&#8239;.AI plans to release their <strong>ARC&#8239;Evolve</strong> code once additional refactoring is complete, extend experiments to larger reasoning models with 300&#8211;400 RL steps, and continue pushing towards a full public entry in the broader ARC Grand&#8239;Prize later this year.
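</p><p>The binary &#8220;full&#8209;solve&#8221; reward and group&#8209;relative scoring described in the strategy above can be sketched in a few lines of Python. This is a hypothetical illustration, not the ARC Evolve code; the grid representation and function names are assumptions:</p>

```python
# Hypothetical sketch of the GRPO signal described above: each sampled
# solution earns a binary reward (1.0 only for a perfect grid match),
# and its advantage is measured relative to its sample group.
# Names and grid representation are illustrative, not from ARC Evolve.

def full_solve_reward(predicted_grid, target_grid):
    """Return 1.0 only if the entire output grid matches exactly."""
    return 1.0 if predicted_grid == target_grid else 0.0

def group_relative_advantages(rewards):
    """GRPO-style advantage: (reward - group mean) / group std."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5
    if std == 0:  # all candidates tied: no learning signal
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Four sampled completions for one puzzle; only the first solves it fully.
target = [[1, 2], [3, 4]]
candidates = [
    [[1, 2], [3, 4]],  # perfect match -> reward 1.0
    [[1, 2], [4, 3]],  # nearly right -> still 0.0 (all-or-nothing)
    [[0, 0], [0, 0]],
    [[1, 2], [3, 0]],
]
rewards = [full_solve_reward(c, target) for c in candidates]
advantages = group_relative_advantages(rewards)
```

<p>Under the all&#8209;or&#8209;nothing rule, a near&#8209;miss earns the same zero reward as a blank grid, which is why careful sample curation carried so much weight in this setup.</p><p>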
They also aim to investigate whether multiple parallel traces of longer chains&#8209;of&#8209;thought yield further gains.</p><div><hr></div><h3><strong>Acknowledgements</strong></h3><p>Vijayraj Gohil and Aditya&#8239;Shah thank <strong>Ben&#8239;Sand, Adam&#8239;Peaston, Tim&#8239;Smoothy, and Rebecca&#8239;Pham</strong> at Strong&#8239;Compute for rapid infrastructure support and guidance throughout the event.</p><p>Github Repo -<a href="https://github.com/vraj130/ArcEvolve"> https://github.com/vraj130/ArcEvolve</a></p><p>Slides -<a href="https://docs.google.com/presentation/d/17f3aFA1XEIFqLk9RSQbdeNM7v0Pou0xetJ2CNeMIugo/edit?usp=sharing">https://docs.google.com/presentation/d/17f3aFA1XEIFqLk9RSQbdeNM7v0Pou0xetJ2CNeMIugo/edit?usp=sharing</a></p>]]></content:encoded></item><item><title><![CDATA[Text-to-Manim: Generating Visual Explanations using GRPO and Gemini Rewards]]></title><description><![CDATA[Automatically converting mathematical questions into visual animations using the Manim animation engine.]]></description><link>https://words.strongcompute.com/p/text-to-manim-generating-visual-explanations</link><guid isPermaLink="false">https://words.strongcompute.com/p/text-to-manim-generating-visual-explanations</guid><dc:creator><![CDATA[Strong Compute]]></dc:creator><pubDate>Mon, 16 Jun 2025 01:38:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Xg_N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xg_N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xg_N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Xg_N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Xg_N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Xg_N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Xg_N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg" width="1024" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/de210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:274215,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://words.strongcompute.com/i/166034997?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Xg_N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Xg_N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Xg_N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Xg_N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde210696-663a-490e-9c06-059bd2a8131a_1024x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Hackathon Winners: Bernett Orlando, Ramprasadh Kumar and Karthik Ragunath Ananda Kumar</figcaption></figure></div><h3><strong>1. Introduction</strong></h3><p>Our overarching mission is to build a personalized AI tutor capable of delivering high-quality educational content to anyone, anywhere. We believe that true democratization of education can only be achieved by making learning deeply engaging, personalized, and universally accessible. 
A critical component of this vision is the ability to transform abstract questions into clear, visual explanations &#8212; a method proven to resonate more effectively with the way humans understand complex concepts.</p><p>In this hackathon project, we focused on one specific but essential challenge: <strong>automatically converting mathematical questions into visual animations</strong> using the Manim animation engine. The end goal is to empower students with dynamic visualizations that enhance understanding, retention, and conceptual clarity &#8212; especially in STEM education.</p><div><hr></div><h3><strong>2. Problem Statement</strong></h3><p>Humans are inherently visual learners. Concepts that are difficult to grasp through text-based explanations can often be instantly clarified through animation or visual demonstrations. Despite the power of this modality, creating educational animations remains a time-intensive and highly manual process. Our challenge was to automate this pipeline: given a natural language mathematical question, can we generate a <strong>Manim animation script</strong> that explains the solution visually?</p><div><hr></div><h3><strong>3. Initial Approach: Supervised Fine-Tuning (SFT)</strong></h3><p>We began by attempting a <strong>supervised fine-tuning (SFT)</strong> approach. Specifically, we fine-tuned the DeepSeek LLM using a dataset of input-output pairs, where:</p><ul><li><p>Input = a mathematical question</p></li><li><p>Output = the corresponding Manim script to animate the explanation</p></li></ul><p>We also attempted to incorporate <strong>Chain-of-Thought (CoT)</strong> reasoning in the outputs, guiding the model to not only solve the problem but also break it down into explanatory visual steps.</p><h4><strong>Challenges</strong></h4><p>However, we encountered two major limitations:</p><ol><li><p><strong>Lack of high-quality training data:</strong> Manim-query pairs are a highly niche and scarce dataset. 
Publicly available examples are limited in volume and diversity.</p></li><li><p><strong>Absence of Chain-of-Thought (CoT) annotations:</strong> Even where datasets exist, few contain intermediate reasoning steps essential for generating coherent explanatory animations.</p></li></ol><p>Due to these challenges, the SFT approach failed to generalize well and lacked visual accuracy and semantic coherence.</p><div><hr></div><h3><strong>4. Proposed Solution: GRPO with Reward Modeling via Gemini as External Judge</strong></h3><p>To address these limitations, we pivoted to a <strong>novel reinforcement learning framework</strong> based on <strong>GRPO (Group Relative Policy Optimization)</strong>. Instead of relying on static data, we introduced an <strong>external LLM-based reward model</strong> &#8212; built on top of Gemini &#8212; to act as a <strong>judge</strong> of the model&#8217;s outputs. This model provided feedback on the quality of generated animations, enabling us to train the base model using reward signals rather than hardcoded labels.</p><h4><strong>Reward Model Criteria</strong></h4><p>Our reward model evaluated each generated Manim animation based on the following five criteria:</p><ol><li><p><strong>Prompt Consistency<br></strong> <em>Does the animation match the original mathematical prompt in terms of objects involved, actions depicted, and conceptual correctness?<br></em></p></li><li><p><strong>Screen Fit<br></strong> <em>Do the visual elements stay within the canvas boundaries? Do any objects overflow or render off-screen?<br></em></p></li><li><p><strong>Non-overlapping Layout<br></strong> <em>Are the visual elements well-spaced? Do objects overlap in distracting or confusing ways?<br></em></p></li><li><p><strong>Semantic Coherence<br></strong> <em>Does the animation make logical sense? For example, do equations appear where expected?
Are objects used in appropriate ways?<br></em></p></li><li><p><strong>Clarity of Explanation<br></strong> <em>Is the final animation pedagogically effective? Would a student find it helpful in understanding the concept?<br></em></p></li></ol><p>These multi-dimensional reward signals allowed us to optimize for visual, spatial, and semantic quality &#8212; aspects that are difficult to enforce via traditional supervised learning.</p><div><hr></div><h3><strong>5. Results and Observations</strong></h3><p>With GRPO and Gemini-based reward modeling, our model demonstrated <strong>significantly better convergence</strong> compared to SFT. Not only did the animations become more visually accurate, but the overall explanatory coherence also improved. The model was able to generalize across a range of simple mathematical prompts and produce clear, legible Manim animations with minimal hallucinations or layout issues.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;f93446d8-aba1-48d6-83d0-cfb130b14d97&quot;,&quot;duration&quot;:null}"></div><p></p><div><hr></div><h3><strong>6. Future Directions</strong></h3><p>This project represents just the beginning of our journey toward building a fully autonomous AI tutor. Moving forward, we plan to:</p><ul><li><p>Expand the complexity and diversity of supported mathematical questions (algebra, calculus, geometry, etc.)</p></li><li><p>Integrate real-time preview and editing tools for generated animations</p></li><li><p>Incorporate user feedback and corrections into the reward signal (RLHF loop)</p></li><li><p>Extend support beyond Manim to other visual engines and modalities (e.g., interactive graphs, 3D geometry)<br></p></li></ul><p>We are excited to continue developing this project with the support of <strong>StrongCompute</strong>, and look forward to pushing the boundaries of personalized AI education.</p><div><hr></div><h3><strong>7. 
Acknowledgments</strong></h3><p>We thank the hackathon organizers and the community for providing a platform to explore such impactful ideas. We are especially grateful to Strong Compute for providing infrastructure and support.</p><div><hr></div><p>This post was written by Karthik Ragunath Ananda Kumar, AI Researcher @ Tavus Inc, Bernett Orlando, Senior ML SWE @ Google Research and Ramprasadh Kumar, Systems @ NVIDIA</p><p></p><p>Links:</p><ul><li><p>Presentation slides: <a href="https://docs.google.com/presentation/d/1wDSfzwl4mtj5r4oWJa_uXkyfGZJWdCPmQXv4D-Wll4M">Google Slides</a></p></li><li><p>Github repo: <a href="https://github.com/ramprasadhkumar/deepseek-video-gen">Link here</a></p></li></ul><p></p>]]></content:encoded></item><item><title><![CDATA[How ClosedAI Won Strong Compute's ARC AGI2 Hackathon #9: Our Journey]]></title><description><![CDATA[This past weekend, my team, ClosedAI, participated in the ARC AGI2 track of Strong Compute&#8217;s intense 24-hour hackathon&#8212;and we ended up winning!]]></description><link>https://words.strongcompute.com/p/how-closedai-won-strong-computes</link><guid isPermaLink="false">https://words.strongcompute.com/p/how-closedai-won-strong-computes</guid><dc:creator><![CDATA[Strong Compute]]></dc:creator><pubDate>Tue, 06 May 2025 22:57:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!WBT-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WBT-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source 
type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WBT-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WBT-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WBT-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WBT-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WBT-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:263968,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://words.strongcompute.com/i/163013307?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WBT-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg 424w, https://substackcdn.com/image/fetch/$s_!WBT-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg 848w, https://substackcdn.com/image/fetch/$s_!WBT-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!WBT-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49cd5d79-dd43-418f-8beb-903884bbad84_2048x1152.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Winners: Aman Priyanshu, Sinha, Sanika Chavan, Mudit Sinha</figcaption></figure></div><p></p><p>This past weekend, my team, ClosedAI, participated in the ARC AGI2 track of Strong Compute&#8217;s intense 24-hour hackathon&#8212;and we ended up winning! Here's a detailed look at our approach, the innovations we introduced, and the results we achieved.</p><p><strong>What's ARC-AGI-2?</strong></p><p>The <a href="https://arcprize.org/">ARC-AGI-2</a> benchmark, created by Fran&#231;ois Chollet, consists of 1,000 challenging visual puzzles designed to assess true abstract reasoning in AI. Human participants typically solve around 60% of these puzzles, whereas most existing AI models only manage between 10% and 20%. 
Each puzzle allows just two submission attempts, demanding high accuracy and generalization from minimal examples.</p><p><strong>Our Strategy</strong></p><p>Given the tight 24-hour constraint, we prioritized maximizing accuracy (pass@2) and computational efficiency. Our team divided the workload into two parallel streams: data augmentation and model architecture. Constant communication and rapid iteration allowed us to promptly resolve issues and share critical insights.</p><p><strong>Our Implementation</strong></p><p><strong>Synthetic Data Generation with LLMs</strong></p><p>We built an automated data generation pipeline using large language models (LLMs). Starting from minimal human-provided examples, we generated hundreds of synthetic puzzle variations per task. These were then filtered and clustered to ensure a diverse and comprehensive training dataset.</p><p><strong>Custom Reasoning Token Blocks</strong></p><p>To make our model&#8217;s reasoning transparent and easily debuggable, we introduced structured "token blocks." Each token block explicitly represented a distinct reasoning step, facilitating rapid error identification and correction.</p><p><strong>The "Less Is More" Architecture (LIMO)</strong></p><p>Inspired by recent research showing the effectiveness of minimal but precise prompts, we employed the LIMO architecture, consisting of:</p><ul><li><p>A primitive encoder converting puzzle grids into structured embeddings.</p></li><li><p>A modular library of fundamental operations (rotate, mirror, count, color-match).</p></li><li><p>A neural scoring mechanism selecting the most plausible operation sequences.</p></li></ul><p><strong>Results &amp; Performance</strong></p><p>Our combined approach achieved a 75% resolution rate on the training puzzles, significantly outperforming the typical AI baseline performance of 10-20%. 
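</p><p>To make the LIMO pipeline concrete, here is a minimal sketch of its selection step (searching short sequences of primitive grid operations and keeping the best scorer). The tiny primitive library and exact-match scorer below are illustrative stand-ins, not our actual modules:</p>

```python
# Illustrative LIMO-style selection step: apply short sequences of
# primitive grid operations and keep the sequence that best reproduces
# the training examples. Names and primitives are toy stand-ins.
from itertools import product

def rotate(grid):
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def mirror(grid):
    """Flip a grid left-to-right."""
    return [row[::-1] for row in grid]

PRIMITIVES = {"rotate": rotate, "mirror": mirror}

def score(candidate, examples):
    """Fraction of training pairs the candidate program reproduces exactly."""
    hits = sum(1 for inp, out in examples if candidate(inp) == out)
    return hits / len(examples)

def best_program(examples, max_len=2):
    """Search all primitive sequences up to max_len; return the top scorer."""
    best, best_s = None, -1.0
    for length in range(1, max_len + 1):
        for seq in product(PRIMITIVES, repeat=length):
            def prog(g, seq=seq):
                for name in seq:
                    g = PRIMITIVES[name](g)
                return g
            s = score(prog, examples)
            if s > best_s:
                best, best_s = seq, s
    return best, best_s

# One training pair: the output is the input rotated clockwise.
examples = [([[1, 2], [3, 4]], [[3, 1], [4, 2]])]
print(best_program(examples))  # -> (('rotate',), 1.0)
```

<p>In the actual system a neural scoring mechanism ranks operation sequences; exact match stands in for it here.</p><p>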
Each puzzle was solved in less than one second, meeting the competition&#8217;s strict efficiency criteria.</p><p><strong>Infrastructure Utilization</strong></p><p>Leveraging Strong Compute&#8217;s Instant Super Computer (ISC) platform, we rapidly conducted parameter sweeps and experiments across numerous A100 GPUs. Automated end-to-end submission checks ensured quick identification and resolution of issues, maintaining seamless workflow continuity.</p><p><strong>Lessons Learned and Future Directions</strong></p><ul><li><p><strong>Early Automation</strong>: Integrating automated end-to-end tests early on was critical in saving debugging time.</p></li><li><p><strong>Modular Design Advantages</strong>: Our modular and structured reasoning approach consistently outperformed monolithic models in accuracy and interpretability.</p></li></ul><p>Future work will involve open-sourcing our synthetic data generation pipeline and reasoning token blocks, along with exploring meta-learning techniques for automatic reasoning strategy discovery.</p><p><strong>Acknowledgments</strong></p><p>We are grateful to Ben Sand, Adam Peaston, Tim Smoothy, and Rebecca Pham from Strong Compute for their invaluable support and mentorship throughout the event. 
Their assistance played a significant role in our success.</p><p>Written by Sanika Chavan, Mudit Sinha, Aman Priyanshu</p><p>Github repo link: <a href="https://github.com/sanikac10/Annotating-ARC-AGI-2/tree/main/Annotating-ARC-AGI-2-main">https://github.com/sanikac10/Annotating-ARC-AGI-2/tree/main/Annotating-ARC-AGI-2-main </a></p><div><hr></div><p>Join us for our next ARC Prize Hackathon in SF and Sydney: <a href="https://lu.ma/strongcompute">https://lu.ma/strongcompute </a></p>]]></content:encoded></item><item><title><![CDATA[Strong Compute GPU Hackathon Recap: DeepCertainty: No Hallucinations, Just Results]]></title><description><![CDATA[We&#8217;ve been running GPU hackathons in San Francisco and Sydney to see what happens when you give smart people full access to compute.]]></description><link>https://words.strongcompute.com/p/strong-compute-gpu-hackathon-recap</link><guid isPermaLink="false">https://words.strongcompute.com/p/strong-compute-gpu-hackathon-recap</guid><dc:creator><![CDATA[Strong Compute]]></dc:creator><pubDate>Fri, 28 Mar 2025 03:50:16 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a1e6643c-b613-43cd-a739-dd6215b305fa_1536x1024.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We&#8217;ve been running GPU hackathons in San Francisco and Sydney to see what happens when you give smart people full access to compute.</p><p>The most exciting projects aren&#8217;t just clever &#8212; they&#8217;re grounded. They tie model output to something you can <em>check</em>. A compile. A benchmark. A math proof. A correct answer, not just a convincing one.</p><p>That&#8217;s a subtle but powerful shift. A lot of machine learning treats model output like a good guess &#8212; probabilistic, fuzzy, often right but not always reproducible. 
These projects took a different approach: <strong>don&#8217;t just generate something &#8212; generate something you can verify.</strong></p><p>And the difference shows.</p><div><hr></div><p><strong>No Hallucinations</strong></p><p>We&#8217;ve seen a move away from the traditional &#8220;trust the model&#8221; mindset toward something more rigorous: <strong>can we prove this works?</strong></p><p>This is especially important in code generation, scientific reasoning, and anything where correctness matters. When you&#8217;re training or fine-tuning on tasks that involve real-world outcomes &#8212; not just vibes &#8212; you need more than confidence. You need certainty.</p><p>At our March hackathons, we saw CUDA and Math Fine Tunings that show provable deep learning is practical:</p><div><hr></div><p><strong>CUDA Codegen from PyTorch Modules</strong></p><p>One team built a smart transpiler that takes PyTorch modules and converts them into CUDA kernels. The model generates CUDA code and then evaluates each candidate across three dimensions:</p><ul><li><p><strong>Does it compile?</strong></p></li><li><p><strong>Does it produce the correct output?</strong></p></li><li><p><strong>Is it faster than the original?</strong></p></li></ul><p>This is a huge unlock. Because now, instead of relying on token-by-token loss or human labels, you can score the model&#8217;s output <em>based on reality</em>. Compilation success becomes a training signal. Runtime performance becomes a benchmark. 
And correctness becomes a pass/fail gate.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n8rG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa466b977-617f-4855-a6d3-e217823ec753_1600x1200.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n8rG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa466b977-617f-4855-a6d3-e217823ec753_1600x1200.jpeg 424w, https://substackcdn.com/image/fetch/$s_!n8rG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa466b977-617f-4855-a6d3-e217823ec753_1600x1200.jpeg 848w, https://substackcdn.com/image/fetch/$s_!n8rG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa466b977-617f-4855-a6d3-e217823ec753_1600x1200.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!n8rG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa466b977-617f-4855-a6d3-e217823ec753_1600x1200.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n8rG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa466b977-617f-4855-a6d3-e217823ec753_1600x1200.jpeg" width="1456" height="1092" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a466b977-617f-4855-a6d3-e217823ec753_1600x1200.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n8rG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa466b977-617f-4855-a6d3-e217823ec753_1600x1200.jpeg 424w, https://substackcdn.com/image/fetch/$s_!n8rG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa466b977-617f-4855-a6d3-e217823ec753_1600x1200.jpeg 848w, https://substackcdn.com/image/fetch/$s_!n8rG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa466b977-617f-4855-a6d3-e217823ec753_1600x1200.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!n8rG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa466b977-617f-4855-a6d3-e217823ec753_1600x1200.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Winning team: Robert Zhang, JRH, our CEO, Ben Sand, and Rahman Hajiyev </figcaption></figure></div><p>They used a method inspired by DeepSeek &#8212; sampling multiple CUDA candidates, scoring them relatively, and feeding that back into training via group-relative policy optimization. 
It&#8217;s reinforcement learning with a feedback loop rooted in physics, not language.</p><p><strong>Results (from Fine Tuning Llama DeepSeek7B on 8x L4s through Strong Compute)</strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3h9M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3h9M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png 424w, https://substackcdn.com/image/fetch/$s_!3h9M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png 848w, https://substackcdn.com/image/fetch/$s_!3h9M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png 1272w, https://substackcdn.com/image/fetch/$s_!3h9M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3h9M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png" width="822" height="158" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:158,&quot;width&quot;:822,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17860,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://words.strongcompute.com/i/160041723?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3h9M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png 424w, https://substackcdn.com/image/fetch/$s_!3h9M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png 848w, https://substackcdn.com/image/fetch/$s_!3h9M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png 1272w, https://substackcdn.com/image/fetch/$s_!3h9M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04617b40-2a42-4000-9cbc-814a4e88a5ce_822x158.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Check out the winning team&#8217;s presentation <a 
href="https://docs.google.com/presentation/d/14G9QH71z-XYlAO_YZ0Xs7mgH8keRca65reWBF1Vwe8M/edit#slide=id.g33b0a5e8485_3_37">here</a>.</p><div><hr></div><p><strong>Mathematical Reasoning with Python Tool-Calling</strong></p><p>Another project focused on mathematical reasoning &#8212; but with a twist.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3k_L!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c645e6d-b3e4-4cc7-817b-d48389078a6c_1600x1200.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3k_L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c645e6d-b3e4-4cc7-817b-d48389078a6c_1600x1200.jpeg 424w, https://substackcdn.com/image/fetch/$s_!3k_L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c645e6d-b3e4-4cc7-817b-d48389078a6c_1600x1200.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3k_L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c645e6d-b3e4-4cc7-817b-d48389078a6c_1600x1200.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!3k_L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c645e6d-b3e4-4cc7-817b-d48389078a6c_1600x1200.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3k_L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c645e6d-b3e4-4cc7-817b-d48389078a6c_1600x1200.jpeg" width="1456" height="1092" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8c645e6d-b3e4-4cc7-817b-d48389078a6c_1600x1200.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3k_L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c645e6d-b3e4-4cc7-817b-d48389078a6c_1600x1200.jpeg 424w, https://substackcdn.com/image/fetch/$s_!3k_L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c645e6d-b3e4-4cc7-817b-d48389078a6c_1600x1200.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3k_L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c645e6d-b3e4-4cc7-817b-d48389078a6c_1600x1200.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!3k_L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c645e6d-b3e4-4cc7-817b-d48389078a6c_1600x1200.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Runners up: Karthik Ragunath Ananda Kumar</figcaption></figure></div><p>Rather than having the model do all the work internally (and risking a hallucinated equation), it called out to Python tools mid-inference. For example, it might solve part of a problem itself, then delegate the numerical computation to a verified function.</p><p>This kind of delegation is exciting. It opens the door to integrating with formal verification tools like Lean&#8212; not just solving math problems, but producing <strong>verifiable, explainable</strong> proofs.</p><p>In practice, mathematicians don&#8217;t just want to know <em>if</em> something is true. They want to understand <em>why</em>. 
The model becomes a co-pilot, helping construct the steps &#8212; not just giving you a binary answer.</p><ul><li><p><a href="https://docs.google.com/presentation/d/1NRduRoailh_MvOQPNf2Dcj9tGKxSq2z6/edit">Check out Karthik and Divya&#8217;s presentation</a></p></li><li><p>GitHub Link For Fine-tuning: <a href="https://github.com/Karthik-Ragunath/isc-demos-karthik/tree/main/deepseek">https://github.com/Karthik-Ragunath/isc-demos-karthik/tree/main/deepseek</a></p></li><li><p>Inference Code: <a href="https://github.com/Karthik-Ragunath/isc-demos-karthik/blob/main/deepseek/inference_consolidated.py">https://github.com/Karthik-Ragunath/isc-demos-karthik/blob/main/deepseek/inference_consolidated.py</a></p></li></ul><div><hr></div><p><strong>Why This Matters</strong></p><p>Verifiable machine learning isn&#8217;t just a niche &#8212; it&#8217;s the direction the field needs to go.</p><p>We&#8217;ve all seen what happens when models are powerful but ungrounded. Outputs that look right but aren&#8217;t. Answers that sound convincing until you test them.</p><p>These projects &#8212; and the teams behind them &#8212; are showing what it looks like to go beyond that. To treat model outputs not as a final product, but as hypotheses. And then build systems that can validate them, at speed.</p><p>We want Strong Compute hackathons to keep pushing in this direction: ideas that are smart <em>and</em> measurable. Tools that show their work. 
Models that can be trusted <em>because</em> they&#8217;re tested.</p><div><hr></div><p><strong>Join to Hack on ARC Prize or Fine-Tune Deep Seek April 18&#8211;19.</strong></p><p>We&#8217;re bringing the GPUs and the hacker house energy back again.</p><p>Whether you choose to push the frontier on reasoning (ARC Prize) or scale a smarter distillation demo (Deep Seek), we&#8217;ve got clusters, food, desks, and a clean training setup ready for you.</p><p><strong>Previous Winners and Grantees:</strong></p><ul><li><p><strong>PyTorch &#8594; CUDA Fine-Tuning</strong>: Improved translation accuracy from 10% to 30%.</p></li><li><p><strong>ARC Prize</strong>: Our grantee placed 2nd in the 2024 ARC contest.</p></li><li><p><strong>Chess Bots</strong>: Trained from scratch to 2000 ELO in just 10 hours.</p></li></ul><p>For engineers, AI researchers, students &#8212; anyone comfortable with PyTorch.</p><p>We provide the Instant Super Computer (ISC), so you can start training multinode in under an hour. No setup headaches. No fuss.</p><p><em>Engineers only. All code. No slidegineers or recruiters. 
All applicants vetted for technical fit.</em></p><div><hr></div><p><strong>Competition A: ARC Prize Challenge</strong></p><ul><li><p>Compete to win compute for the 2025 ARC Prize</p></li><li><p>Work on unsolved ARC-AGI-2 tasks with full resources and benchmarks</p></li><li><p>Judged on research rigor, novelty, and benchmark performance</p></li></ul><p><strong>Competition B: Deep Seek Fine-Tuning</strong></p><ul><li><p>Fine-tune DeepSeek-R1 distill variants on your dataset</p></li><li><p>Show what your model can do that the base model can&#8217;t</p></li><li><p>Model sizes: 1.5B to 70B &#8212; all provided</p></li></ul><p><strong>Prize: $2.5K&#8211;$25K Research Compute Grant</strong></p><div><hr></div><p>Let&#8217;s push the frontier &#8212; together.</p><p><a href="https://lu.ma/eptxamdp?utm_source=strongwords">Apply now</a> &#8212; see you April 18-19.</p>]]></content:encoded></item><item><title><![CDATA[Scaling AI Research from 4 to 60+ GPUs: How Strong Compute Enabled InSite's AI for Construction Monitoring.]]></title><description><![CDATA[A Case Study with Insite Project Solutions]]></description><link>https://words.strongcompute.com/p/scaling-ai-research-from-4-to-60</link><guid isPermaLink="false">https://words.strongcompute.com/p/scaling-ai-research-from-4-to-60</guid><dc:creator><![CDATA[Strong Compute]]></dc:creator><pubDate>Wed, 04 Dec 2024 02:38:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><strong>Customer Situation</strong></h2><p><strong>Insite is developing AI monitoring for Construction Projects.</strong></p><p><strong>The development approach used many university research teams (160 developers) to prototype solutions.</strong></p><p><strong>A large scale up of compute management was 
needed.</strong></p><p>Cian, founder of InSite Project Solutions, had to coordinate 160 university students across 26 AI research teams - a massive expansion from previous years. He faced a critical infrastructure challenge. His existing in-house compute setup of 3-4 GPUs couldn't support this scale of concurrent AI development, putting the timeline and resources for developing computer vision models for construction sites at risk.</p><h2><strong>Project Goals</strong></h2><p>Radical improvement to construction site monitoring through AI-powered, ultra-high-resolution imagery analysis. The solution delivers:</p><ul><li><p>24/7 monitoring with unprecedented detail, capturing site activity up to 800 meters away</p></li><li><p>80% improvement in AI model performance using 64-megapixel imagery</p></li><li><p>6-10x cost reduction compared to traditional on-site project planning</p></li><li><p>Real-time analytics and comprehensive reporting for construction managers</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!phSu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F272ec38d-bec2-460e-9efe-235e263646e5_1600x1324.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!phSu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F272ec38d-bec2-460e-9efe-235e263646e5_1600x1324.png 424w, https://substackcdn.com/image/fetch/$s_!phSu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F272ec38d-bec2-460e-9efe-235e263646e5_1600x1324.png 848w, 
https://substackcdn.com/image/fetch/$s_!phSu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F272ec38d-bec2-460e-9efe-235e263646e5_1600x1324.png 1272w, https://substackcdn.com/image/fetch/$s_!phSu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F272ec38d-bec2-460e-9efe-235e263646e5_1600x1324.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!phSu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F272ec38d-bec2-460e-9efe-235e263646e5_1600x1324.png" width="1456" height="1205" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/272ec38d-bec2-460e-9efe-235e263646e5_1600x1324.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1205,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!phSu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F272ec38d-bec2-460e-9efe-235e263646e5_1600x1324.png 424w, https://substackcdn.com/image/fetch/$s_!phSu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F272ec38d-bec2-460e-9efe-235e263646e5_1600x1324.png 848w, 
https://substackcdn.com/image/fetch/$s_!phSu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F272ec38d-bec2-460e-9efe-235e263646e5_1600x1324.png 1272w, https://substackcdn.com/image/fetch/$s_!phSu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F272ec38d-bec2-460e-9efe-235e263646e5_1600x1324.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Previous Infrastructure Limitations</strong></h2><p>Without the Strong Compute, InSite&#8217;s AI developers faced 
significant technical hurdles:</p><ul><li><p>Limited GPU availability creating research bottlenecks</p></li><li><p>No job scheduling system, leading to "first-come, first-served" chaos</p></li><li><p>Resource conflicts with teams frequently hitting "CUDA Out of Memory" errors</p></li><li><p>Performance degradation from concurrent workloads</p></li></ul><h2><strong>The Strong Compute Solution</strong></h2><p>Strong Compute expanded InSite's research capabilities by:</p><ul><li><p>Seamlessly scaling from 4 to 60+ GPUs</p></li><li><p>Eliminating infrastructure management overhead</p></li><li><p>Providing robust job scheduling and resource allocation</p></li><li><p>Enabling true parallel research across 26 teams</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ElZL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6741acb-43be-4c49-8b5c-ff4870f73a02_1600x1175.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ElZL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6741acb-43be-4c49-8b5c-ff4870f73a02_1600x1175.png 424w, https://substackcdn.com/image/fetch/$s_!ElZL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6741acb-43be-4c49-8b5c-ff4870f73a02_1600x1175.png 848w, https://substackcdn.com/image/fetch/$s_!ElZL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6741acb-43be-4c49-8b5c-ff4870f73a02_1600x1175.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ElZL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6741acb-43be-4c49-8b5c-ff4870f73a02_1600x1175.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ElZL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6741acb-43be-4c49-8b5c-ff4870f73a02_1600x1175.png" width="1456" height="1069" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b6741acb-43be-4c49-8b5c-ff4870f73a02_1600x1175.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1069,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ElZL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6741acb-43be-4c49-8b5c-ff4870f73a02_1600x1175.png 424w, https://substackcdn.com/image/fetch/$s_!ElZL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6741acb-43be-4c49-8b5c-ff4870f73a02_1600x1175.png 848w, https://substackcdn.com/image/fetch/$s_!ElZL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6741acb-43be-4c49-8b5c-ff4870f73a02_1600x1175.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ElZL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6741acb-43be-4c49-8b5c-ff4870f73a02_1600x1175.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rLN4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rLN4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png 424w, https://substackcdn.com/image/fetch/$s_!rLN4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png 848w, https://substackcdn.com/image/fetch/$s_!rLN4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png 1272w, https://substackcdn.com/image/fetch/$s_!rLN4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rLN4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png" width="1456" height="868" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:868,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!rLN4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png 424w, https://substackcdn.com/image/fetch/$s_!rLN4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png 848w, https://substackcdn.com/image/fetch/$s_!rLN4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png 1272w, https://substackcdn.com/image/fetch/$s_!rLN4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bf3631e-94ca-4fb9-aa1c-f8d3f3590d3a_1600x954.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Business Impact</strong></h2><p>"It just worked in the background," says Cian. "I didn't need to manage it. I just set up the accounts and the users went in and used it. I didn't need to get in and fix it if it broke or set it up or whatever. It just worked."</p><p>Strong Compute enabled InSite Project Solutions to:</p><ul><li><p>Successfully manage an 8x growth in AI researchers</p></li><li><p>Accelerate development of cutting-edge computer vision models</p></li><li><p>Avoid expensive cloud computing costs and complex cloud configurations</p></li><li><p>Focus on innovation instead of infrastructure management</p></li></ul><h2><strong>Why Strong Compute?</strong></h2><p>Strong Compute proved to be the perfect solution for scaling AI research operations:</p><ul><li><p>Zero infrastructure management overhead</p></li><li><p>Immediate access to massive GPU computing power</p></li><li><p>Cost-effective alternative to cloud providers</p></li><li><p>Built-in safeguards against runaway computing costs</p></li><li><p>Seamless onboarding for large research teams</p></li></ul><p>Using Strong Compute, InSite Project Solutions transformed a potential operational nightmare into a seamless research operation. 
This enabled breakthrough innovations in construction site monitoring while managing a record number of concurrent AI development teams.</p>]]></content:encoded></item><item><title><![CDATA[Inside Our Chess Bot Hackathons and Zero-Code Cluster]]></title><description><![CDATA[A few months ago, we kicked off our AI chess bot hackathons with a big question: How can we make AI training more accessible while showcasing our zero-code cluster management?]]></description><link>https://words.strongcompute.com/p/inside-our-chess-bot-hackathons-and</link><guid isPermaLink="false">https://words.strongcompute.com/p/inside-our-chess-bot-hackathons-and</guid><dc:creator><![CDATA[Strong Compute]]></dc:creator><pubDate>Wed, 30 Oct 2024 03:38:24 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/83af360e-00e4-4006-9b76-036c8185890f_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A few months ago, we kicked off our AI chess bot hackathons with a big question: <em>How can we make AI training more accessible while showcasing our zero-code cluster management?</em> </p><p>Inspired to push the boundaries, we decided to build a chess bot in a weekend. </p><p>What started as an ambitious project has evolved into a proving ground for our capabilities, with 1,100 GPUs across five providers and 40 engineers running simultaneous training workloads.</p><h3><strong>The Power Behind the Hackathons: Our System&#8217;s Capabilities</strong></h3><p>Our system, refined over two years, can handle complex, large-scale workloads seamlessly. Here&#8217;s what sets it apart:</p><ul><li><p><strong>Up to 90GB/sec (720Gbps) on-cluster data read speed</strong></p></li><li><p><strong>Up to 60GB/sec (480Gbps) cloud-to-cloud data transfer</strong></p></li><li><p><strong>Up to 20GB/sec (160Gbps) to a single node for container loads</strong></p></li><li><p><strong>Integrated across 6 cloud providers</strong></p></li><li><p><strong>Scales to support 1,000+ GPUs and 40 developers simultaneously</strong></p></li><li><p><strong>Compatibility with GPUs (H100, A100, A10), scaling from 1 GPU to 16 GPUs per node</strong></p></li><li><p><strong>Infiniband &amp; Ethernet support for high-performance needs</strong></p></li></ul><p>With this setup, developers can scale from a single GPU to a full cluster in just an hour. 
We introduced Live Billing Systems and Real-Time Cost Controls to keep costs manageable, offering features like per-developer budgets and one-click stop controls.</p><p></p><h3><strong>Recap: Previous Hackathons</strong></h3><h4><strong>Hackathon 1 - Chess vision</strong></h4><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/jpeg&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d61065ba-5973-474a-a462-3be2bf6f932b_1080x1620.jpeg&quot;},{&quot;type&quot;:&quot;image/jpeg&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/78fa754b-3d68-48e0-bb26-3441e512efef_1620x1080.jpeg&quot;},{&quot;type&quot;:&quot;image/jpeg&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8a13c2d7-1046-455a-a1b8-cec639fcbc16_1620x1080.jpeg&quot;},{&quot;type&quot;:&quot;image/jpeg&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84a50f06-eda4-49a9-ae7e-1f0710735655_1620x1080.jpeg&quot;},{&quot;type&quot;:&quot;image/jpeg&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2b29713e-a0d3-487f-896d-c213dea4cdc4_1620x1080.jpeg&quot;}],&quot;caption&quot;:&quot;&quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/12f4a46a-9a37-4e7f-b864-f1a32ad2162f_1456x1210.png&quot;}},&quot;isEditorNode&quot;:true}"></div><p>Our first cut of the chess hackathon concept formulated the task as a regression problem. What does a human do when deciding which move to make?&nbsp;</p><p>Well, who really knows what <em>humans</em> do, but what <em>we</em> do is consider a handful of potential moves (maybe even all possible moves) and develop a feeling for which are good and which are bad. 
Then we pick the move that feels like the best one.</p><p>To replicate this process with an AI, we train a neural network to calculate that &#8220;feeling&#8221; as a quantitative score for every potential move, and then we sample from the distribution described by those scores to select a move.</p><p>By &#8220;a move&#8221; we mean a potential board state that the player could move to: the state of the board at the end of the move. We encode the board as an 8x8 tensor of integers and pass that as input to our neural network to evaluate.</p><p>We also transform the board from being &#8220;white pieces&#8221; and &#8220;black pieces&#8221; to being &#8220;my pieces&#8221; and &#8220;opponent pieces&#8221;, orienting the board accordingly, such that the model is always asked to score the board from the perspective of the player about to move.</p><p>We included two example model architectures suitable for this task in the chess-hackathon repository: a ResNet-based Convolutional Neural Network (CNN) and a Transformer-based model.&nbsp;</p><p>Both model types relied on learned embeddings. In the case of the CNN, embeddings were used to convert the 2D tensor of integers into a 3D tensor of floats whose third dimension is analogous to the channels of an image. 
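</p><p>The encoding described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the chess-hackathon repository&#8217;s actual code: it parses the board field of a FEN string into an 8x8 grid of integers, with the mover&#8217;s pieces positive, the opponent&#8217;s pieces negative, and the board reoriented when black is to move.</p>

```python
# Illustrative sketch (not the chess-hackathon repo's actual code):
# encode a board as an 8x8 grid of integers, always from the
# perspective of the player about to move.
PIECE_CODES = {"p": 1, "n": 2, "b": 3, "r": 4, "q": 5, "k": 6}

def encode_board(piece_placement, white_to_move):
    """piece_placement: the board field of a FEN string.
    Returns an 8x8 list of ints: +code for the mover's pieces,
    -code for the opponent's pieces, 0 for empty squares."""
    rows = []
    for rank in piece_placement.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():                 # a digit means that many empty squares
                row.extend([0] * int(ch))
            else:
                code = PIECE_CODES[ch.lower()]
                sign = 1 if (ch.isupper() == white_to_move) else -1
                row.append(sign * code)      # "my" pieces positive, opponent negative
        rows.append(row)
    if not white_to_move:
        # flip ranks and files so the mover always plays "up" the board
        rows = [list(reversed(r)) for r in reversed(rows)]
    return rows

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
board = encode_board(start, white_to_move=True)
```

<p>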
For the Transformer model, embeddings were additively infused with positional information.</p><p>The strongest models from the first hackathon round were predominantly CNN-based models.</p><h4><strong>Hackathon 2 - ChessGPT</strong></h4><div class="image-gallery-embed" data-attrs="{&quot;gallery&quot;:{&quot;images&quot;:[{&quot;type&quot;:&quot;image/heic&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2dfbda91-b2e9-4a17-9d71-123639dbee6c_5712x4284.heic&quot;},{&quot;type&quot;:&quot;image/heic&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/03c80119-9d25-4a24-bb8c-a633820db0ee_5712x4284.heic&quot;},{&quot;type&quot;:&quot;image/heic&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b37dd79-9bd3-437b-8e66-8755ca5a1197_4032x3024.heic&quot;}],&quot;caption&quot;:&quot;&quot;,&quot;alt&quot;:&quot;&quot;,&quot;staticGalleryImage&quot;:{&quot;type&quot;:&quot;image/png&quot;,&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/51e3282c-b542-4acd-ba4f-5fd858a00da4_1456x474.png&quot;}},&quot;isEditorNode&quot;:true}"></div><p></p><p>Throughout the course of the first hackathon we got a lot of questions about LLMs. Can we bring them? Can we use them? Our answer was essentially &#8220;no&#8221;.&nbsp;</p><p>Firstly we had decided that all models must be trained from initialization (from scratch) throughout the course of the hackathon, no pre-trained model weights were allowed. Secondly the task that we had formulated did not seem at all amenable to LLMs. Perhaps this was a failure of imagination, but we also wanted to maximise the likelihood that everyone would be able to submit a functional model.</p><p>In any event we were inspired to look more into the potential to include an LLM track to the chess hackathon. 
After some searching we discovered the <a href="https://adamkarvonen.github.io/machine_learning/2024/01/03/chess-world-models.html">work of Adam Karvonen</a>, which demonstrated that an LLM (of modest size) can be trained from scratch on PGNs (historic chess games recorded in Portable Game Notation) to do next-character prediction in a GPT-like manner and thereby generate the next move to be made in the game.</p><p>We were fascinated by the apparent capability of the Transformer architecture, as shown in Adam&#8217;s work, to learn latent representations of a partially completed game which demonstrably encode details of the board state, despite the model never having been explicitly shown what a chess board even looks like.</p><p>The second hackathon sought to implement this formulation of the task, training &#8220;ChessGPT&#8221; models to do next-character prediction on a <a href="https://storage.lczero.org/files/training_pgns/test60/">dataset comprising PGNs</a> from recent training runs by Leela Chess Zero.</p><p>Rather than trust the models implicitly to generate valid moves, we generated all possible moves and asked the models to score each one by the probability that the game PGN continues with it.</p><p>One observation worth noting is that the ChessGPT models seemed weak at identifying and exploiting blunders made by their opponents. We speculate this might be due to our choice of training data: PGNs from games played by a highly competent chess engine, which contain very few if any serious blunders (hanging a queen, for example). 
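</p><p>The move-selection scheme described above (enumerate every legal move, then score each by the probability the model assigns to the PGN continuing with it) can be sketched with a stand-in model. Everything here is hypothetical: <code>char_log_prob</code> is a placeholder for a trained ChessGPT&#8217;s next-character distribution, and only the scoring loop reflects the approach described.</p>

```python
import math

# Hypothetical stand-in for a trained ChessGPT: a real model would
# return log P(next_char | context) from a Transformer trained on PGNs.
def char_log_prob(context, ch):
    return math.log(0.9) if ch.isalnum() else math.log(0.1)

def score_continuation(pgn, move_san):
    """Log-probability that the game PGN continues with move_san."""
    total, context = 0.0, pgn
    for ch in move_san:
        total += char_log_prob(context, ch)
        context += ch
    return total

def pick_move(pgn, legal_moves_san):
    # Greedy argmax over legal moves; the bots could instead sample
    # from a softmax over these scores.
    return max(legal_moves_san, key=lambda m: score_continuation(pgn, m))

best = pick_move("1. e4 e5 2. ", ["Nf3", "Qh5", "a3"])
# under this toy model the shortest all-alphanumeric move scores highest
```

<p>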
The model would therefore consider it very unlikely that a game would continue with a piece moving to take the queen at the particular stage of the game.&nbsp;</p><h4><strong>Hackathon 3 - Vision and ChessGPT</strong></h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-ayj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F819639dc-cea5-476c-b5cb-6e781fcd7101_3024x2114.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-ayj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F819639dc-cea5-476c-b5cb-6e781fcd7101_3024x2114.jpeg 424w, https://substackcdn.com/image/fetch/$s_!-ayj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F819639dc-cea5-476c-b5cb-6e781fcd7101_3024x2114.jpeg 848w, https://substackcdn.com/image/fetch/$s_!-ayj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F819639dc-cea5-476c-b5cb-6e781fcd7101_3024x2114.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!-ayj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F819639dc-cea5-476c-b5cb-6e781fcd7101_3024x2114.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-ayj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F819639dc-cea5-476c-b5cb-6e781fcd7101_3024x2114.jpeg" width="1456" height="1018" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/819639dc-cea5-476c-b5cb-6e781fcd7101_3024x2114.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1018,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!-ayj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F819639dc-cea5-476c-b5cb-6e781fcd7101_3024x2114.jpeg 424w, https://substackcdn.com/image/fetch/$s_!-ayj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F819639dc-cea5-476c-b5cb-6e781fcd7101_3024x2114.jpeg 848w, https://substackcdn.com/image/fetch/$s_!-ayj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F819639dc-cea5-476c-b5cb-6e781fcd7101_3024x2114.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!-ayj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F819639dc-cea5-476c-b5cb-6e781fcd7101_3024x2114.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>For the third and subsequent hackathons we unified the two formulations attempted for the first two hackathons.&nbsp;</p><p>At each move, models were required to take two inputs - the PGN of the game up to that move, and a short string representing the potential next move in Standard Algebraic Notation (SAN) - and return a score for that potential move.</p><p>ChessGPT models could proceed by appending the potential move string to the PGN and passing this sequence directly to the Transformer network.</p><p>Vision models were required to convert the PGN and potential move SAN into a representation of the potential board state and score that potential board state.</p><p>The strongest models from this hackathon were predominantly vision-based models, which were markedly more capable of identifying and exploiting blunders, but the single strongest model - check out the blog linked below - used interleaved convolutional and self-attention 
layers.</p><h2><strong>How to win the Chess Hackathon</strong></h2><p>There have been a couple of consistent features of the winning team approaches. We&#8217;ll detail a few of our thoughts below, but you might also like to <a href="https://words.strongcompute.com/p/case-study-how-our-team-won-the-mega">hear from the recent winners</a> themselves how they achieved victory.</p><ol><li><p><strong>Choose a simple model architecture and training approach</strong></p></li></ol><p>The chess-hackathon repository and provided datasets are generally more than enough to work with. If you do want to experiment with a novel architecture, make sure you have spent some time researching that architecture ahead of time and validate that the model input and output tensors are the correct type and shape. If you want to bring your own dataset, spend some time designing and testing your data pipeline ahead of time.</p><ol start="2"><li><p><strong>Validate your model early</strong></p></li></ol><p>Your model might be the strongest chess AI the world has ever seen, but if it takes a whole cluster of compute and an hour to make a move (or if we can&#8217;t run it for some other reason) then we just won&#8217;t let it play, and a surefire way <em>not</em> to win is to not be allowed to compete.</p><p>We publish a validation script with the chess-hackathon repository that checks your model meets our tournament specifications. 
Before you even launch your model to train, generate a checkpoint and validate that it will pass our pre-flight check.&nbsp;</p><p>We also publish super detailed instructions on how to develop your model so that it meets our compatibility requirements, so pay close attention to those and set your project up to be compatible from the beginning.</p><ol start="3"><li><p><strong>Start training early and train for as long as possible</strong></p></li></ol><p>Deep learning models take time to train; you are likely to run out of cluster time before your model stops improving in training. The winning teams have consistently been those whose models were able to train for many hours. Start training early, and train for as long as you can.</p><p>You might be wondering: what will I do with all the time while I wait for my model to train? Here are some suggestions.</p><p>Firstly, always be recovering your checkpoints and evaluating your models. Evaluating models is tricky when your training and target objectives are so loosely connected. How do I know if my model is good at chess? How does anyone know they&#8217;re good at chess? Play them off and see which one wins. Play against them yourself.</p><p>Secondly, be prepared for your training run to fail at some point. This might happen due to a hardware failure on the cluster you&#8217;re training on, or an internet or power outage. Interruptions are an inevitable fact of life when you&#8217;re training on hundreds of GPUs at a time. 
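Interruptions like these are why checkpointing matters. A minimal resume-from-latest pattern (a sketch with hypothetical paths and state, not our tooling) writes a numbered checkpoint periodically and, on startup, picks up from the highest-numbered one found on disk:</p>

```python
import json
from pathlib import Path

CKPT_DIR = Path("checkpoints")

def save_checkpoint(step: int, state: dict) -> None:
    CKPT_DIR.mkdir(exist_ok=True)
    # Write to a temp file, then rename: a crash mid-write never corrupts
    # the latest checkpoint, because rename is atomic on POSIX filesystems.
    tmp = CKPT_DIR / f"step_{step:08d}.json.tmp"
    tmp.write_text(json.dumps({"step": step, "state": state}))
    tmp.rename(CKPT_DIR / f"step_{step:08d}.json")

def load_latest_checkpoint():
    """Return the highest-numbered checkpoint, or None if none exist yet."""
    ckpts = sorted(CKPT_DIR.glob("step_*.json"))
    return json.loads(ckpts[-1].read_text()) if ckpts else None

# Resume-or-start: the training loop always begins from the latest checkpoint.
latest = load_latest_checkpoint()
start = latest["step"] + 1 if latest else 0
for step in range(start, start + 5):
    state = {"loss": 1.0 / (step + 1)}   # stand-in for real model/optimizer state
    if step % 2 == 0:
        save_checkpoint(step, state)

print("resumed from:", latest)
```

<p>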
When your training is interrupted, you&#8217;re going to want to recover the latest checkpoint and start training again.</p><h3><strong>Looking Ahead: The Next Mega Chess Hackathon</strong></h3><p>We&#8217;ve heard the feedback that a weekend may not be enough time to dive deep.&nbsp;</p><p>That&#8217;s why we&#8217;re opening up early access for <a href="https://lu.ma/strongcompute">our next event.&nbsp;</a></p><p>Participants can join virtually a week ahead for onboarding, system access, and experiment credits. Then, the hackathon weekend will open with burst access in San Francisco and Sydney.</p><p>Our next Mega Chess Hackathon promises to be our biggest yet. You&#8217;ll have the chance to leverage powerful tools, experiment with advanced models, and test your AI chess skills.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vPSH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b876da-d6e6-4552-80f6-380296347656_1620x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vPSH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b876da-d6e6-4552-80f6-380296347656_1620x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vPSH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b876da-d6e6-4552-80f6-380296347656_1620x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!vPSH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b876da-d6e6-4552-80f6-380296347656_1620x1080.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!vPSH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b876da-d6e6-4552-80f6-380296347656_1620x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vPSH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b876da-d6e6-4552-80f6-380296347656_1620x1080.jpeg" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/39b876da-d6e6-4552-80f6-380296347656_1620x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:403236,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vPSH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b876da-d6e6-4552-80f6-380296347656_1620x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vPSH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b876da-d6e6-4552-80f6-380296347656_1620x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!vPSH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b876da-d6e6-4552-80f6-380296347656_1620x1080.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!vPSH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39b876da-d6e6-4552-80f6-380296347656_1620x1080.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://words.strongcompute.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for 
reading Strong Words! Subscribe for free to receive new posts and support our work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Case Study: How Our Team Won the Mega Chess Hackathon with Deep Learning and Rapid Iteration]]></title><description><![CDATA[Our team recently competed in and won the Strong Compute Mega Chess Hackathon. Ten San Francisco and Sydney teams competed simultaneously to build the strongest possible chess-playing deep learning model in just two days. The event culminated in a model vs. model tournament, where the bots faced off to determine the final winner. It was an exciting and challenging experience. We would like to share some of the lessons learned along the way.]]></description><link>https://words.strongcompute.com/p/case-study-how-our-team-won-the-mega</link><guid isPermaLink="false">https://words.strongcompute.com/p/case-study-how-our-team-won-the-mega</guid><dc:creator><![CDATA[Strong Compute]]></dc:creator><pubDate>Tue, 24 Sep 2024 18:00:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Czik!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Czik!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Czik!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Czik!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Czik!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Czik!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Czik!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg" width="1456" height="1018" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1018,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" 
srcset="https://substackcdn.com/image/fetch/$s_!Czik!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Czik!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Czik!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Czik!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff486637e-2bb8-4cbc-a748-57de2f65e640_3024x2114.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Our team recently competed in and won the Strong Compute <a href="https://lu.ma/strongcompute">Mega Chess Hackathon</a>. Ten San Francisco and Sydney teams competed simultaneously to build the strongest possible chess-playing deep learning model in just two days. The event culminated in a model vs. model tournament, where the bots faced off to determine the final winner. It was an exciting and challenging experience. We would like to share some of the lessons learned along the way.&nbsp;</p><h2><strong>Team Formation and Strategy</strong></h2><p>Our team comprised <a href="https://www.linkedin.com/feed/">Justin F. Knoll</a>, <a href="https://www.linkedin.com/in/suryaprakash360/">Suryaprakash Senthil Kumar</a>, and <a href="https://www.linkedin.com/in/ashishmukharji/">Ashish Mukharji</a><strong>. </strong>We did not know each other before the event but made a point to connect via Zoom and share our backgrounds, possible technical approaches, working styles, and goals for the event beforehand. We were confident in our team and approach before the event started. Forming a team and getting a rough consensus on our approach in advance saved us precious hacking time and is a highly recommended tactic.</p><p>Once the event began, we dove in to familiarize ourselves with the Strong Compute ISC platform, the provided datasets and models, and the actual tournament gameplay example scripts.</p><p>Our first objective was to close the loop: to train a very basic model from randomized weights into a candidate model competing in a one-round mock tournament. It&#8217;s hard to overstate how valuable this was in ensuring we understood all parts of the stack, the submission requirements, and the tournament API.</p><p>We started a multi-hour training run and pulled one of the intermediate checkpoints to close the loop. Seeing even a very weak and undertrained model playing chess on a live-refreshing board was a magical moment! The gameplay test script gave us a way to evaluate models against each other heuristically.</p><p>We let the model train and monitored training loss, rank correlation, adaptive learning rate adjustments, etc. to gauge training performance.</p><h2><strong>Exploring a Range of Technical Approaches</strong></h2><p>Confident that we understood the full stack and submission requirements, and with a way to approximately evaluate model performance, we turned our attention to selecting our own moves as a team within the tournament.</p><p>Given the complexity of building a competitive chess-playing model, we explored two high-level approaches: a vision-like model and a GPT-based model. 
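To make the "vision-like" idea concrete, here is a minimal sketch (our own illustration, not the hackathon code) of the standard input layout for convolutional chess models: a 12-plane 8x8 one-hot array, one plane per piece type and colour, built here from a FEN piece-placement string:</p>

```python
# Encode a FEN piece-placement field into a 12x8x8 one-hot array,
# the typical input layout for convolutional ("vision") chess models.
PIECES = "PNBRQKpnbrqk"  # 6 white piece types, then 6 black

def fen_to_planes(placement: str):
    planes = [[[0] * 8 for _ in range(8)] for _ in PIECES]
    for rank, row in enumerate(placement.split("/")):
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)  # a digit means that many empty squares
            else:
                planes[PIECES.index(ch)][rank][file] = 1
                file += 1
    return planes

# Starting position, piece-placement field only.
start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
planes = fen_to_planes(start)
total = sum(sum(sum(r) for r in p) for p in planes)
print(total)  # 32 pieces on the board
```

<p>A network then convolves over these planes just as it would over image channels, which is why CNN tooling transfers so directly to chess.</p><p>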
One key inspiration for the GPT-based models was <strong>Adam Karvonen&#8217;s paper</strong> on &#8220;Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models.&#8221; Within the vision approach, we experimented with CNNs, transformers, and multi-head attention.</p><p>Adam&#8217;s mechanistic interpretability research on Chess-GPT models applied linear probes to the network&#8217;s activations and concluded that the network creates an emergent world model including the chessboard, piece positions, and even latent variables like player Elo rating. Learning about this research was a fascinating side quest, but the mechanistic interpretability results are ultimately about emergent world models rather than model performance.</p><p>At times, we worried about training a GPT-based model based on Leela Chess Zero self-play, since such a model is ultimately doing probabilistic next-token prediction over examples from the training corpus, and one presumes that some of the Leela Chess Zero self-play games from early training would be examples of spectacularly poor play! On the other hand, if one were to train a GPT-based model on only grandmaster games, it would never have seen the sort of blunders we expected to encounter in the tournament models, and so wouldn&#8217;t know how to exploit them. In general, GPT-based chess models are fascinating, but it seemed harder to reason about how to train them for high performance.</p><p>We did some initial hyperparameter tuning of the models by modifying values and observing training metrics over short &#8220;cycle&#8221; test runs, then using the selected parameters for longer &#8220;burst&#8221; runs. 
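The cycle-then-burst pattern can be sketched like this (hypothetical function names and a toy objective, purely for illustration): score every candidate configuration with a short cheap run, then spend the long run only on the winner:</p>

```python
import itertools

def short_cycle_run(lr: float, batch: int) -> float:
    """Stand-in for a brief training run; returns a proxy validation loss."""
    return abs(lr - 3e-4) * 100 + abs(batch - 64) / 64

def long_burst_run(lr: float, batch: int, hours: float) -> str:
    """Stand-in for the expensive multi-hour training job."""
    return f"training {hours}h with lr={lr}, batch={batch}"

grid = itertools.product([1e-3, 3e-4, 1e-4], [32, 64, 128])
# Cheap "cycle" runs over the whole grid...
best = min(grid, key=lambda cfg: short_cycle_run(*cfg))
# ...then one expensive "burst" run with the selected parameters.
print(long_burst_run(*best, hours=12))
```

<p>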
We stopped short of doing a more structured hyperparameter sweep.</p><h2><strong>Team-Level Divide and Conquer</strong></h2><p>We were able to divide and conquer by having individual team members focus on optimizing different model approaches and run long-lived &#8220;burst&#8221; jobs to train those models in parallel. This was key to parallelizing progress and maximizing how quickly we could validate &#8212; or discard &#8212; our hypotheses.</p><p>We had access to multi-gigabyte datasets, including historical grandmaster games and Leela Chess Zero self-play. We experimented with merging some of the provided datasets, which was not difficult and seemed effective. We also tried adding our own outside datasets (for example, a 29GB <a href="https://lichess.org/">Lichess</a> database export), but time constraints forced us to focus more on model tuning and training on the provided datasets than on converting and ingesting outside data.</p><h2><strong>Leveraging Strong Compute&#8217;s ISC for Training</strong></h2><p>All teams were provided with access to Strong Compute&#8217;s service, which made it possible for us to train our models using powerful 72xA100 clusters. This infrastructure was a game-changer for rapid iteration.</p><h2><strong>The Winning Model</strong></h2><p>On day two, we selected some of the longest-trained models with the best training metrics and played them against each other using the gameplay script, keeping an informal tally of performance. 
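Our informal playoff amounted to a round-robin with a win tally - roughly the following sketch, where the game function and checkpoint names are hypothetical stand-ins for the actual gameplay script and our models:</p>

```python
import itertools
import random
from collections import Counter

def play_game(white: str, black: str, rng: random.Random):
    """Stub for the real gameplay script: returns the winner's name, or None for a draw."""
    return rng.choice([white, black, None])

# Hypothetical checkpoint names, for illustration only.
candidates = ["cnn_attn_step_40k", "cnn_attn_step_60k", "gpt_small_step_30k"]
rng = random.Random(42)
tally = Counter()
# Every ordered pair plays once, so each pairing is played twice,
# once with each colour, to cancel out first-move advantage.
for white, black in itertools.permutations(candidates, 2):
    winner = play_game(white, black, rng)
    if winner is not None:
        tally[winner] += 1

print(tally.most_common())
```

<p>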
The vision approaches were dominant, so we did some final tuning and played a sub-tournament amongst the strongest two vision models, ultimately selecting a CNN-based model with multi-head attention and dilated convolution to expand the receptive field and capture relationships further across the board.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WByS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1c43214-b402-43cf-8392-4ed31b28c06b_1600x911.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WByS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1c43214-b402-43cf-8392-4ed31b28c06b_1600x911.png 424w, https://substackcdn.com/image/fetch/$s_!WByS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1c43214-b402-43cf-8392-4ed31b28c06b_1600x911.png 848w, https://substackcdn.com/image/fetch/$s_!WByS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1c43214-b402-43cf-8392-4ed31b28c06b_1600x911.png 1272w, https://substackcdn.com/image/fetch/$s_!WByS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1c43214-b402-43cf-8392-4ed31b28c06b_1600x911.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WByS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1c43214-b402-43cf-8392-4ed31b28c06b_1600x911.png" width="1456" height="829" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f1c43214-b402-43cf-8392-4ed31b28c06b_1600x911.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:829,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!WByS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1c43214-b402-43cf-8392-4ed31b28c06b_1600x911.png 424w, https://substackcdn.com/image/fetch/$s_!WByS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1c43214-b402-43cf-8392-4ed31b28c06b_1600x911.png 848w, https://substackcdn.com/image/fetch/$s_!WByS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1c43214-b402-43cf-8392-4ed31b28c06b_1600x911.png 1272w, https://substackcdn.com/image/fetch/$s_!WByS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1c43214-b402-43cf-8392-4ed31b28c06b_1600x911.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Post-Hackathon Reflection</strong></h2><p>Participating in this hackathon was a valuable learning experience. Two days is not much time to build and coordinate as a team, and we didn&#8217;t get a chance to implement many of our ideas. Just as with building a commercial product, we had to strike a balance between speed and rigor and aim to maximize the rate of validated learning.</p><p>While overall time was scarce, the ability to easily do distributed training on a 72xA100 cluster was a game changer. More data, more epochs, deeper models: all of these were feasible. 
Ensuring that we were always using the cluster for some experiment and not letting it idle was an important tactic.</p><h2><strong>Acknowledgments</strong></h2><p>A huge thanks to my teammates and everyone who made this event possible, especially <strong>Ben Sand, Adam Peaston, Tim Smoothy, and Rebecca Pham from Strong Compute</strong>, for providing the infrastructure and support that allowed us to compete at this level.</p><h2><strong>Conclusion</strong></h2><p>This hackathon pushed our limits and taught us the value of rapid iteration, strategic model selection, and leveraging powerful computing resources. We&#8217;re thrilled with our success and glad to be able to share the lessons here.</p><p>This was written by Chess Hackathon participants, Ashish Mukharji, Justin F. Knoll and Suryaprakash Senthil Kumar.</p><p>Try Strong Compute at our next<a href="https://lu.ma/strongcompute"> event</a>.</p>]]></content:encoded></item></channel></rss>