Markdown (incl. Gemini Chat Responses) to Blogger Blog Post HTML - Procedures and Tips



VS Code NBSP Find and Replace

20:59, 13 April 2026

From Gem chat: VS Code NBSP Find and Replace

Me: When I copy-pasted a message from one of our chats into VS Code, what looked to me like a space got copied over as an NBSP character, which displayed as a space but was not matched by a find for a space.
...

[After getting the solution.]

Me: That's it. Now it shows me a border around it. Hmm. I think this has changed in recent VS Code updates. Previously I would see it without the need for this setting.

Please give me a short message on this issue and fix which I will add to my blog.

Gem: It is very likely that a recent update tightened the default "Unicode Highlight" logic to reduce visual noise, moving the NBSP (Non-Breaking Space) into the "Non-Basic ASCII" category which defaults to being hidden in trusted workspaces.

Here is a concise draft you can use for your blog.


Fixing "Invisible" NBSP Character Issues in VS Code

When copying code or text from web-based chat applications into VS Code, you may occasionally bring over a Non-Breaking Space (NBSP, U+00A0) instead of a standard space (U+0020). These characters look identical but can break searches, scripts, and compilers.

The Problem

Even with Render Whitespace enabled, VS Code may display an NBSP as a regular space without any warning. This often happens because modern VS Code versions categorize the NBSP as "Non-Basic ASCII," and the safety highlighter is disabled by default for "Trusted Workspaces."

The Fix

To ensure VS Code always highlights these "imposter" spaces with a visible border and a hover-tooltip:

  1. Open Settings (Ctrl + ,).
  2. Search for Editor > Unicode Highlight: Non Basic ASCII.
  3. Change the setting from inUntrustedWorkspace to true.
  4. (Optional) Ensure Editor > Unicode Highlight: Ambiguous Characters is also checked.

Once enabled, any NBSP character will be surrounded by a thin border, allowing you to quickly identify and replace it before it causes "ghost" bugs in your code.
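
Outside VS Code, the same check can be scripted. The following is a minimal Python sketch, not part of the chat above: the helper names find_nbsp and replace_nbsp and the sample text are made up for illustration. It reports where NBSP characters occur and swaps them for ordinary spaces.

```python
NBSP = "\u00a0"  # non-breaking space, U+00A0


def find_nbsp(text: str) -> list[tuple[int, int]]:
    """Return 1-based (line, column) pairs for every NBSP in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == NBSP:
                hits.append((line_no, col_no))
    return hits


def replace_nbsp(text: str) -> str:
    """Replace every NBSP with an ordinary space (U+0020)."""
    return text.replace(NBSP, " ")


# Hypothetical sample: two lines, each with one NBSP that looks like a space.
sample = "let\u00a0x = 1;\nlet y\u00a0= 2;"

print(find_nbsp(sample))      # positions of the NBSP characters
print(replace_nbsp(sample))   # same text with plain spaces
```

Running this on a pasted snippet before saving it is one way to confirm whether a failed search is caused by NBSP characters.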

===============

Removing id attribute from header tags using regex in VS Code

13:18, 16 April 2026

[20 Apr 2026 Update: The ids in the long .html file mentioned below were inserted by markdowntohtml.com which I use to convert Gemini chat responses from markdown to html. The site does not seem to have an option to disable id attributes being added. end-update.]

Me: I have a long .html file open in VS Code. It has entries like:
<h3 id="1-the-native-title-attribute-the-basics-">

I want to change them to:
<h3>
----

i.e. I want to remove the id attribute in these h3 tags. Is there a regular expression find (and replace) which I can use which will match all such h3 tags?

Gem: In VS Code, you can use a regular expression to target these specific h3 tags. By using the "Find and Replace" tool (Ctrl + H or Cmd + Alt + F on Mac), you can strip the id attributes instantly.

The Find and Replace Pattern

To match the id attribute and its contents while keeping the opening <h3> tag intact, use the following:

  • Find: <h3\s+id="[^"]*">
  • Replace: <h3>

How the Regex Works

Symbol | Purpose
<h3 | Matches the literal opening of the tag.
\s+ | Matches one or more whitespace characters (the space after "h3").
id=" | Matches the start of the ID attribute.
[^"]* | A negated character set that matches everything until it hits the closing quote.
"> | Matches the closing quote and the end of the tag.

Steps in VS Code

  1. Open the Replace widget (Ctrl + H).
  2. Click the Use Regular Expression icon (the .* button) or press Alt + R.
  3. Enter the Find and Replace values as shown above.
  4. Click Replace All (the icon with two overlapping squares).
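
The same find and replace can be tried outside VS Code. This is a minimal Python sketch using the pattern from the chat unchanged; the function name strip_h3_ids and the heading text in the sample are assumptions made for illustration.

```python
import re

# Same pattern as the VS Code "Find" field above.
H3_WITH_ID = re.compile(r'<h3\s+id="[^"]*">')


def strip_h3_ids(html: str) -> str:
    """Replace every <h3 id="..."> opening tag with a bare <h3>."""
    return H3_WITH_ID.sub("<h3>", html)


# The id value comes from the example in the post; the heading text is made up.
sample = '<h3 id="1-the-native-title-attribute-the-basics-">1. The Native title Attribute</h3>'

print(strip_h3_ids(sample))
```

An h3 that has no id attribute is left as it is, since the pattern requires id="..." right after the tag name.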

Pro Tip: Because [^"]* stops at the first closing quote, the regex cannot "over-eat" the way a greedy .* would, so it will not accidentally delete parts of the following tags on the same line.
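
To see what "over-eating" looks like, the sketch below compares the [^"]* pattern with a greedy .* alternative. The greedy pattern is a hypothetical mistake shown only for contrast; it is not from the chat, and the sample line is made up.

```python
import re

# Hypothetical line with a second tag that also contains "> later on.
line = '<h3 id="intro">Intro</h3><p class="note">Text</p>'

# Greedy .* runs to the LAST "> on the line, swallowing the tags in between.
greedy = re.sub(r'<h3\s+id=".*">', "<h3>", line)

# [^"]* stops at the first closing quote, so only the id attribute is removed.
negated = re.sub(r'<h3\s+id="[^"]*">', "<h3>", line)

print(greedy)
print(negated)
```

With the greedy version, the heading text and the opening of the paragraph tag are lost; with the negated character set, only the id attribute disappears.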

That worked well.
It worked for h4 elements too.
So this is a nice way to strip ids from h2, h3 and h4 headers. These ids typically get added when a Gemini web chat response copied as markdown is converted to HTML by a markdown-to-html converter.
  • Optionally, in VS Code, class="no-toc" can be applied to all configured header elements using my VS Code extension command:
    • Auto Header IDs and TOC: Mark Configured Headers as no-toc
  • h2 and h3 headers added manually after the above step (without no-toc class) can have ids assigned to them by the extension command: 
    • Auto Header IDs and TOC: Add IDs to Headers
  • Table of Contents can be generated with extension command: 
    • Auto Header IDs and TOC: Create Table of Contents

Changing h3 tags to h2 excluding h3 tags followed by specific class using regex in VS Code

Me: [I had added h3 headers instead of h2 and so had to use below find-replace]
I have h3 headers of this pattern:
<h3>HTML Formatting Elements</h3>
which I want to be changed to h2 headers
<h2>HTML Formatting Elements</h2>
But I want following pattern h3 headers not to be changed:
<h3 class="no-toc">1. Physical vs. Semantic Tags</h3>
Can we do this using Regular expression find replace in VS Code?

Gem: Yes, you can do this by using a negative lookahead. This allows you to tell VS Code: "Find an <h3> tag, but only if it is not followed by a specific class."

The Find and Replace Pattern

Open the Replace widget (Ctrl + H) and ensure the Regex mode (.* icon) is active.

  • Find: <h3(?!\s+class="no-toc")>(.*?)</h3>
  • Replace: <h2>$1</h2>

Breakdown of the Logic

Part | Meaning
<h3 | Matches the literal start of the tag.
(?!\s+class="no-toc") | Negative Lookahead: Ensures the very next characters are NOT a space followed by class="no-toc". If that class is present, the match fails.
> | Matches the closing bracket of the opening tag.
(.*?) | Capture Group: Matches the text inside the header. The ? makes it "lazy" so it stops at the first </h3> it finds.
</h3> | Matches the literal closing tag.
$1 | In the replace field, this puts back whatever text was captured inside the original brackets.
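
The pattern can be checked on the two example headers from the question with a short Python sketch. The function name promote_h3 is an assumption for illustration; note that Python's re module writes the back-reference as \1 where the VS Code replace field uses $1.

```python
import re

# Same pattern as the VS Code "Find" field above.
H3_PLAIN = re.compile(r'<h3(?!\s+class="no-toc")>(.*?)</h3>')


def promote_h3(html: str) -> str:
    """Turn plain <h3>...</h3> headers into <h2>...</h2>, leaving no-toc ones alone."""
    return H3_PLAIN.sub(r"<h2>\1</h2>", html)


# The two header examples quoted in the question.
sample = (
    "<h3>HTML Formatting Elements</h3>\n"
    '<h3 class="no-toc">1. Physical vs. Semantic Tags</h3>'
)

print(promote_h3(sample))
```

Only the first header changes to h2; the header carrying class="no-toc" is printed unchanged.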

Important Considerations

  • Class Order: The regex above matches only bare <h3> tags, because it requires > immediately after <h3. A "no-toc" header that has other attributes first (e.g., <h3 id="top" class="no-toc">) is therefore left unchanged, and so is any other h3 that carries attributes.
  • Global Classes: If you have multiple different classes you want to exclude, you can adjust the lookahead, but for your specific request regarding the no-toc pattern, the logic above is the most direct solution.

That worked well.
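
The pattern from the chat only ever matches a bare <h3>, since > must come right after <h3. If h3 headers with other attributes also need converting, one possible variant is sketched below. It is an assumption, not something from the chat: it skips any h3 whose opening tag contains class="no-toc" anywhere and carries the remaining attributes over to the h2. The sample headers with id attributes are made up.

```python
import re

# Pattern from the chat: matches only a bare <h3> with no attributes.
ORIGINAL = re.compile(r'<h3(?!\s+class="no-toc")>(.*?)</h3>')

# Hypothetical variant: the lookahead scans the whole opening tag for
# class="no-toc", and group 1 keeps any other attributes for the new h2.
VARIANT = re.compile(r'<h3(?![^>]*\bclass="no-toc")([^>]*)>(.*?)</h3>')

samples = [
    "<h3>HTML Formatting Elements</h3>",
    '<h3 id="top" class="no-toc">1. Physical vs. Semantic Tags</h3>',
    '<h3 id="intro">Introduction</h3>',
]

original_out = [ORIGINAL.sub(r"<h2>\1</h2>", s) for s in samples]
variant_out = [VARIANT.sub(r"<h2\1>\2</h2>", s) for s in samples]

for before, a, b in zip(samples, original_out, variant_out):
    print(before, "|", a, "|", b)
```

In the VS Code replace field the variant's replacement would be written as <h2$1>$2</h2>. The variant only checks for the exact text class="no-toc", so a header with several classes in one attribute would need a further adjustment.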

Note that the Settings link on the VS Code Extensions page for my extension took me to:
@ext:Ravi S. Iyer.auto-header-ids-toc in Settings.
That showed nothing, which confused me. I thought the default (of 1 and 2) would apply.

Changing the Settings search to auto-header-ids-toc
showed the Headers To Process list as having:
1
2
3
-----

That's why, when I used the "Auto Header IDs and TOC: Mark Configured Headers as no-toc" command, it applied no-toc to h3 headers too.

SOP for Gemini chat markdown conversion to html for Blogger post

  1. In VS Code, use the following to remove unwanted ids from Gem chat h2, h3 and h4 headers in the converted html (shown for h3; repeat with h2 and h4 in place of h3):
    • Find: <h3\s+id="[^"]*">
    • Replace: <h3>
  2. Run "Auto Header IDs and TOC: Mark Configured Headers as no-toc" command
    • Note Settings -> auto-header-ids-toc:Headers To Process typically is set to: 1 2 3
    • So h1, h2 and h3 are configured headers.
    • Note that default for auto-header-ids-toc:Headers To Process is 1 2
  3. Add my own sections (whose internal links will be created and added in TOC) as h2 headers without id attributes.
  4. Run "Auto Header IDs and TOC: Add IDs to Headers" command
  5. Position cursor where TOC has to be inserted. Next run "Auto Header IDs and TOC: Create Table of Contents" command
  6. In the TOC, remove the unwanted top <ul>. Add <hr> at the end of the TOC.
  7. Check html rendering in browser.
  8. Copy html content to Blogger post. Remove h1 header from top of post content, if present.
  9. Check post in Blogger Compose. If OK, publish post and then check its rendering and TOC links.
I used the above SOP for a three-post series whose first part is: HTML Refresher and Detailed Study - Part 1, which links to the other two parts.
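
Step 1 of the SOP could possibly be done in one pass for h2, h3 and h4 by capturing the tag name. The Python sketch below shows the idea; the combined pattern, the function name strip_header_ids and the sample ids are assumptions for illustration and are not what I actually ran. In the VS Code replace widget the equivalent would be Find: <(h[2-4])\s+id="[^"]*"> and Replace: <$1>.

```python
import re

# Capture the tag name (h2, h3 or h4) so one pattern covers all three levels.
HEADER_WITH_ID = re.compile(r'<(h[2-4])\s+id="[^"]*">')


def strip_header_ids(html: str) -> str:
    """Remove the id attribute from h2, h3 and h4 opening tags."""
    return HEADER_WITH_ID.sub(r"<\1>", html)


# Hypothetical converter output; the ids and heading text are made up.
sample = (
    '<h1 id="title">Title</h1>\n'
    '<h2 id="overview">Overview</h2>\n'
    '<h3 id="1-details-">1. Details</h3>\n'
    '<h4 id="note">Note</h4>'
)

print(strip_header_ids(sample))
```

The h1 line is left alone here because the pattern covers only levels 2 to 4.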
------------
