<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Balzabu | Blog</title>
    <link>https://blog.balzabu.io/tags/cursor/</link>
    <description>Recent content on Balzabu | Blog</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <lastBuildDate>Wed, 29 Jan 2025 21:02:13 +0200</lastBuildDate>
    <atom:link href="https://blog.balzabu.io/tags/cursor/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Ollama &#43; Continue: Open-Source is Always Better!</title>
      <link>https://blog.balzabu.io/posts/deepseek-ollama-continue/</link>
      <pubDate>Wed, 29 Jan 2025 21:02:13 +0200</pubDate>
      <guid>https://blog.balzabu.io/posts/deepseek-ollama-continue/</guid>
      <description>&lt;hr&gt;
&lt;p&gt;In a previous post, I discussed setting up a local environment to run language models on your own machine (&lt;a href=&#34;https://blog.balzabu.io/posts/lmstudio_continue/&#34;&gt;Read the original post&lt;/a&gt;).
&lt;br&gt;I’ve since concluded that while LM Studio has its merits, it doesn’t fully align with my principles. So I’ve decided to switch to &lt;strong&gt;Ollama&lt;/strong&gt; and refresh my setup to try the recently hyped Chinese model, DeepSeek R1, which was reportedly trained with synthetic data based on ChatGPT responses. &lt;em&gt;(FTW: Work smarter, not harder! :D)&lt;/em&gt;&lt;/p&gt;</description>
      <content:encoded><![CDATA[<hr>
<p>In a previous post, I discussed setting up a local environment to run language models on your own machine (<a href="https://blog.balzabu.io/posts/lmstudio_continue/">Read the original post</a>).
<br>I’ve since concluded that while LM Studio has its merits, it doesn’t fully align with my principles. So I’ve decided to switch to <strong>Ollama</strong> and refresh my setup to try the recently hyped Chinese model, DeepSeek R1, which was reportedly trained with synthetic data based on ChatGPT responses. <em>(FTW: Work smarter, not harder! :D)</em></p>
<img src="../../images/ollama_continue_deepseekr1/sad_realization.jpg" title="sad_realization" alt="sad_realization">
<hr>
<h2 id="installing-ollama">Installing Ollama</h2>
<p>Installing Ollama is straightforward. While compiling the project locally might be challenging for some, pre-built releases are available on the official website: <a href="https://ollama.com">ollama.com</a>. Once it is downloaded and installed, you can use the command-line interface (CLI) to download and run models.</p>
<h2 id="available-deepseek-r1-models">Available DeepSeek R1 Models</h2>
<p>Here is a table summarizing the currently available <strong>DeepSeek R1</strong> models:</p>
<table>
  <thead>
      <tr>
          <th>Name</th>
          <th>Size</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>deepseek-r1:1.5b</strong></td>
          <td>1.1 GB</td>
      </tr>
      <tr>
          <td><strong>deepseek-r1:7b</strong></td>
          <td>4.7 GB</td>
      </tr>
      <tr>
          <td><strong>deepseek-r1:8b</strong></td>
          <td>4.9 GB</td>
      </tr>
      <tr>
          <td><strong>deepseek-r1:14b</strong></td>
          <td>~ 9 GB</td>
      </tr>
      <tr>
          <td><strong>deepseek-r1:32b</strong></td>
          <td>~ 20 GB</td>
      </tr>
      <tr>
          <td><strong>deepseek-r1:70b</strong></td>
          <td>~ 43 GB</td>
      </tr>
      <tr>
          <td><strong>deepseek-r1:671b</strong></td>
          <td>~ 404 GB</td>
      </tr>
  </tbody>
</table>
<h2 id="selecting-the-right-model">Selecting the Right Model</h2>
<p>Selecting the appropriate model depends on your hardware. Here’s a table to help you decide:</p>
<table>
  <thead>
      <tr>
          <th>Parameters</th>
          <th>RAM</th>
          <th>VRAM</th>
          <th>Use Case</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>1.5B</strong></td>
          <td>~4 GB</td>
          <td>~3.5 GB</td>
          <td>Simple tasks on modest PCs</td>
      </tr>
      <tr>
          <td><strong>7B</strong></td>
          <td>~8–10 GB</td>
          <td>~8 GB</td>
          <td>Intermediate tasks</td>
      </tr>
      <tr>
          <td><strong>14B</strong></td>
          <td>~16 GB</td>
          <td>~12 GB</td>
          <td>Advanced tasks</td>
      </tr>
      <tr>
          <td><strong>70B</strong></td>
          <td>~40 GB</td>
          <td>~40 GB</td>
          <td>Complex tasks on powerful PCs</td>
      </tr>
      <tr>
          <td><strong>671B</strong></td>
          <td>~1,342 GB</td>
          <td>~1,342 GB</td>
          <td>Highly specialized tasks requiring extensive computational resources</td>
      </tr>
  </tbody>
</table>
<p><strong>Notes:</strong></p>
<ul>
<li>
<p>The VRAM requirements are approximate and can vary based on specific configurations and quantization techniques.</p>
</li>
<li>
<p>Quantization methods can reduce VRAM usage. For instance, a 1.58-bit quantized version of the DeepSeek-R1 model can fit into 160 GB of VRAM, allowing it to run on two NVIDIA H100 80GB GPUs. (<a href="https://unsloth.ai/blog/deepseekr1-dynamic">unsloth.ai</a>)</p>
</li>
<li>
<p>For CPU-based inference without a GPU, it&rsquo;s possible to run certain quantized versions of the model with as little as 20 GB of RAM, though inference will be considerably slower than on a GPU. (<a href="https://unsloth.ai/blog/deepseekr1-dynamic">unsloth.ai</a>)</p>
</li>
</ul>
<p>When selecting a model, make sure your hardware comfortably meets its requirements: a model that does not fit in VRAM falls back to system RAM and the CPU, and runs noticeably slower.</p>
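<p>As a rough sanity check on the figures above, the memory needed just to hold a model&rsquo;s weights can be approximated as parameter count times bits per weight. The sketch below is only a back-of-the-envelope helper (the name <code>weights_gb</code> is hypothetical, and it assumes every parameter is stored with the same number of bits). It ignores the KV cache, activations and runtime overhead, so read the results as lower bounds rather than exact requirements.</p>

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 10**9 bytes).

    Assumption: every parameter uses the same number of bits.
    KV cache, activations and runtime overhead are NOT included.
    """
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


if __name__ == "__main__":
    # 671B parameters at 16 bits per weight: matches the ~1,342 GB row above
    print(round(weights_gb(671, 16)))
    # Same model at 1.58 bits per weight: below the 160 GB quoted in the notes
    print(round(weights_gb(671, 1.58), 1))
```

<p>The first call reproduces the 671B row of the table, which suggests that row assumes unquantized 16-bit weights; the smaller rows are consistent with quantized builds instead.</p>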
<h2 id="downloading-and-running-the-model">Downloading and Running the Model</h2>
<p>To download and run a model using <strong>Ollama</strong>, follow these steps:</p>
<ol>
<li>
<p><strong>Start Ollama</strong>: After installation, Ollama will run in the background without displaying any visible interface.</p>
</li>
<li>
<p><strong>Download the Model</strong>: Open a terminal and execute the following command to download the desired model:</p>






<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>ollama pull deepseek-r1:7b</span></span></code></pre></div>
<p>This command will download the latest version of the <code>deepseek-r1:7b</code> model.</p>
<img src="../../images/ollama_continue_deepseekr1/ollama_pull.png" title="ollama_pull" alt="ollama_pull">
</li>
<li>
<p><strong>Serve the Model</strong>: Once downloaded, run the following command to expose the model on your local machine:</p>






<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>ollama serve</span></span></code></pre></div>
<p>By default, the server listens on <code>http://localhost:11434</code>.
Note that if Ollama is already running in the background (see step 1), you don&rsquo;t need to run this command yourself: in my case, the server was already up after the first pull.</p>
</li>
</ol>
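<p>Once the server is listening, other tools talk to it over plain HTTP. The sketch below shows one way to send a prompt from a script using only the Python standard library. It relies on Ollama&rsquo;s <code>/api/generate</code> endpoint with the <code>model</code>, <code>prompt</code> and <code>stream</code> fields, none of which appear in the steps above, and the helper names are hypothetical.</p>

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address from the step above


def build_generate_request(model: str, prompt: str,
                           base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /api/generate endpoint."""
    # "stream": False asks for one JSON object instead of a stream of chunks
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text (needs a running server)."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

<p>With the server running and the model pulled, <code>generate("deepseek-r1:7b", "Say hello in one short sentence.")</code> should return the model&rsquo;s answer as a string.</p>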
<hr>
<h2 id="models-i-downloaded">Models I Downloaded</h2>
<p>I also downloaded a few additional models to expand my local setup. Here’s the list of models I have installed on Ollama:</p>






<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>ollama ls</span></span></code></pre></div>
<table>
  <thead>
      <tr>
          <th>Model Name</th>
          <th>ID</th>
          <th>Size</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><strong>llama3.2:3b</strong></td>
          <td>a80c4f17acd5</td>
          <td>2.0 GB</td>
      </tr>
      <tr>
          <td><strong>nomic-embed-text:latest</strong></td>
          <td>0a109f422b47</td>
          <td>274 MB</td>
      </tr>
      <tr>
          <td><strong>qwen2.5-coder:3b</strong></td>
          <td>e7149271c296</td>
          <td>1.9 GB</td>
      </tr>
      <tr>
          <td><strong>deepseek-r1:7b</strong></td>
          <td>0a8c26691023</td>
          <td>4.7 GB</td>
      </tr>
  </tbody>
</table>
<img src="../../images/ollama_continue_deepseekr1/ollama_ls.png" title="ollama_ls" alt="ollama_ls">
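<p>If you script your setup, it can help to check which models are still missing before pulling anything. Here is a minimal sketch, assuming the <code>ollama ls</code> output has a header row followed by one model per line with the name in the first column. The sample text is adapted from the table above and the function names are hypothetical.</p>

```python
SAMPLE_LS_OUTPUT = """\
NAME                       ID              SIZE
llama3.2:3b                a80c4f17acd5    2.0 GB
nomic-embed-text:latest    0a109f422b47    274 MB
qwen2.5-coder:3b           e7149271c296    1.9 GB
deepseek-r1:7b             0a8c26691023    4.7 GB
"""


def installed_models(ls_output: str) -> list[str]:
    """Return the first column of every non-empty row after the header."""
    rows = [line.split() for line in ls_output.splitlines()[1:]]
    return [row[0] for row in rows if row]


def with_tag(name: str) -> str:
    """A name without an explicit tag refers to the ':latest' tag."""
    return name if ":" in name else f"{name}:latest"


def missing_models(required: list[str], installed: list[str]) -> list[str]:
    """Models from `required` that do not appear in `installed`."""
    have = {with_tag(name) for name in installed}
    return [name for name in required if with_tag(name) not in have]


wanted = ["deepseek-r1:7b", "nomic-embed-text", "deepseek-r1:14b"]
print(missing_models(wanted, installed_models(SAMPLE_LS_OUTPUT)))
```

<p>In a real script, the sample text could be replaced with <code>subprocess.run(["ollama", "ls"], capture_output=True, text=True).stdout</code>.</p>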
<hr>
<h2 id="integrating-with-continue-extension">Integrating with Continue Extension</h2>
<p>To integrate the model with your development environment, you can use the <a href="https://marketplace.visualstudio.com/items?itemName=Continue.continue">Continue</a> extension. After installation, update the <code>config.json</code> file to include the models you’ve downloaded.</p>
<h3 id="path-to-configjson">Path to <code>config.json</code></h3>
<ul>
<li><strong>macOS and Linux</strong>: <code>~/.continue/config.json</code></li>
<li><strong>Windows</strong>: <code>%USERPROFILE%\.continue\config.json</code></li>
</ul>
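<p>Both locations are the same <code>.continue</code> folder inside the user&rsquo;s home directory, so a script can resolve the path without branching on the operating system. A small sketch (the function name is hypothetical):</p>

```python
from pathlib import Path
from typing import Optional


def continue_config_path(home: Optional[Path] = None) -> Path:
    """Path to config.json inside the ".continue" folder of the home directory."""
    # Path.home() resolves to the user's home directory on every platform
    base = home if home is not None else Path.home()
    return base / ".continue" / "config.json"
```

<p>Calling <code>continue_config_path().exists()</code> is then a quick way to check whether the extension has already created the file.</p>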
<h3 id="sample-configjson">Sample <code>config.json</code></h3>
<p>Here’s an example of what the configuration file might look like:</p>






<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;allowAnonymousTelemetry&#34;</span>: <span style="color:#66d9ef">false</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;models&#34;</span>: [
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;title&#34;</span>: <span style="color:#e6db74">&#34;DeepSeek-R1 7B&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;provider&#34;</span>: <span style="color:#e6db74">&#34;ollama&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;model&#34;</span>: <span style="color:#e6db74">&#34;deepseek-r1:7b&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;title&#34;</span>: <span style="color:#e6db74">&#34;Llama 3.2 3B&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;provider&#34;</span>: <span style="color:#e6db74">&#34;ollama&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;model&#34;</span>: <span style="color:#e6db74">&#34;llama3.2:3b&#34;</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>  ],
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;tabAutocompleteModel&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;title&#34;</span>: <span style="color:#e6db74">&#34;Qwen2.5-Coder 3B&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;provider&#34;</span>: <span style="color:#e6db74">&#34;ollama&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;model&#34;</span>: <span style="color:#e6db74">&#34;qwen2.5-coder:3b&#34;</span>
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;embeddingsProvider&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;provider&#34;</span>: <span style="color:#e6db74">&#34;ollama&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;model&#34;</span>: <span style="color:#e6db74">&#34;nomic-embed-text&#34;</span>
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;contextProviders&#34;</span>: [
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;code&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;params&#34;</span>: {}
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;docs&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;params&#34;</span>: {}
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;diff&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;params&#34;</span>: {}
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;terminal&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;params&#34;</span>: {}
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;problems&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;params&#34;</span>: {}
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;folder&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;params&#34;</span>: {}
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;codebase&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;params&#34;</span>: {}
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>  ],
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;slashCommands&#34;</span>: [
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;share&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;description&#34;</span>: <span style="color:#e6db74">&#34;Export the current chat session to markdown&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;cmd&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;description&#34;</span>: <span style="color:#e6db74">&#34;Generate a shell command&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;commit&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;description&#34;</span>: <span style="color:#e6db74">&#34;Generate a git commit message&#34;</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>  ]
</span></span><span style="display:flex;"><span>}</span></span></code></pre></div>
<p>Save the file, and you’re ready to generate code!</p>
<img src="../../images/ollama_continue_deepseekr1/ollama_vscode.png" title="ollama_vscode" alt="ollama_vscode">
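<p>Every model named in the configuration has to be pulled in Ollama first, otherwise Continue has nothing to talk to. The sketch below collects the model names from a config that follows the layout of the sample (<code>models</code>, <code>tabAutocompleteModel</code>, <code>embeddingsProvider</code>) and prints the matching <code>ollama pull</code> commands. The function name is hypothetical and the embedded JSON is a trimmed copy of the sample.</p>

```python
import json


def ollama_models_in_config(config: dict) -> list[str]:
    """Collect the model name of every entry whose provider is "ollama"."""
    entries = list(config.get("models", []))
    for key in ("tabAutocompleteModel", "embeddingsProvider"):
        if key in config:
            entries.append(config[key])
    names = [entry["model"] for entry in entries if entry.get("provider") == "ollama"]
    return list(dict.fromkeys(names))  # drop duplicates, keep first-seen order


SAMPLE = json.loads("""
{
  "models": [
    {"provider": "ollama", "model": "deepseek-r1:7b"},
    {"provider": "ollama", "model": "llama3.2:3b"}
  ],
  "tabAutocompleteModel": {"provider": "ollama", "model": "qwen2.5-coder:3b"},
  "embeddingsProvider": {"provider": "ollama", "model": "nomic-embed-text"}
}
""")

for name in ollama_models_in_config(SAMPLE):
    print(f"ollama pull {name}")
```

<p>For a real check, replace <code>SAMPLE</code> with the parsed contents of your own <code>config.json</code>.</p>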
<p><strong>Note</strong>: If you care about privacy as much as I do, don&rsquo;t forget to disable anonymous telemetry via the <code>allowAnonymousTelemetry</code> setting shown in the sample (for additional details, see <a href="https://docs.continue.dev/telemetry">docs.continue.dev</a>). It&rsquo;s also worth noting that Ollama adds itself to the startup processes by default; if you prefer to prevent this behaviour, <strong>make sure to disable it</strong>.</p>
<p>Happy hacking!</p>
<hr>
<h2 id="contacts">Contacts</h2>
<p>For questions or suggestions, contact: <a href="mailto:noc@balzabu.io">noc@balzabu.io</a>.</p>
]]></content:encoded>
    </item>
  </channel>
</rss>
