HTML Refresher and Detailed Study - Part 4
Contents
Data attributes
- Data attributes in HTML
- Some React libraries use data attributes
- W3Schools HTML tutorial has limited coverage of data attributes
- MDN guide page for data attributes is recommended by Gemini
- Dynamically generated HTML elements may be more easily targeted via data attributes
- Backend server populates elements with data attributes to identify them
- State tracking using data attributes
- Passing small bits of configuration using data attributes
- Some React libraries using data attributes to identify elements it interacts with
- React Styling libraries like Headless UI using data attributes to signal state to CSS
- React library using data attributes for metadata avoiding prop drilling
- Data attributes store data in DOM itself in contrast to JavaScript variables
- Data attributes are more easily visible in Chrome Inspector/DevTools as compared to JavaScript variables
- Small data-attributes.html sample test file
XSS security vulnerability of innerHTML; textContent is safe
- textContent needs newline escape character and CSS white-space pre-line for line breaks
- XSS security vulnerability of innerHTML
- innerHTML security flagging is a problem. textContent avoids that
- Use textContent with specific CSS instead of innerHTML for multi-line content to avoid security flagging
Misc: Notifications API, IndexedDB, Cookies, Authentication, Fetch API
- HTML Notifications API are for system notifications and so useful only when combined with service worker
- IndexedDB
- Video: IndexedDB - What is it, and when you should choose it
- Official Google Video: IndexedDB - Progressive Web App Training
- Cookies rarely used by App Dev nowadays but HttpOnly Cookie used for better security for tokens like JWT
- Google services access: API Key and OAuth 2.0 / OpenID
- Plain JS libraries for cookies
- Header-based Authentication common for React SPAs
- More detailed: Cookies rarely used at App dev level now. Libraries use HttpOnly cookie for JWT as it is more secure
- Fetch API Refresher Using Modern Async Await for Promises
- In React/Nextjs apps I typically used Axios library for fetch type operations
- Handling Fetch Promises: Modern Async/Await is preferred over Legacy .then() Chaining
- Promise.all can be used with Async/Await; Avoiding overloading mind with .then() promise chaining syntax details
Gemini chat: 4-HTML Refresher n Study
Date: ~23 to 24 April 2026
Note that this topic was discussed in bursts and not continuously in the chat. So related parts are dispersed through the chat.
Data attributes in HTML
Me: Tell me about data attributes in HTML. How often are they used in Plain JS projects? How often are they used in React projects?
Gem: HTML data attributes (often called data-* attributes) are a standard way to store extra information directly on HTML elements without using non-standard attributes or polluting the class list.
Any attribute starting with data- is treated as a data attribute. For example:
<div data-user-id="123" data-status="active"></div>
1. How They Work
Data attributes allow you to embed custom data that can be accessed via CSS or JavaScript. In JS, they are available through the dataset property.
Accessing Data in JavaScript
const element = document.querySelector('div');
// Reading
const userId = element.dataset.userId; // Converts kebab-case to camelCase
console.log(userId); // "123"
// Writing
element.dataset.status = 'inactive';
Accessing Data in CSS
You can style elements based on their data attributes using attribute selectors:
div[data-status="active"] {
border: 2px solid green;
}
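The kebab-case to camelCase conversion mentioned above can be sketched as a pair of small functions. These helpers are illustrative only (the browser does this mapping internally; attrToDatasetKey and datasetKeyToAttr are made-up names), but they show the rule in both directions:

```javascript
// Illustrative helpers (not a browser API) that mirror how data-* attribute
// names map to dataset keys and back.
function attrToDatasetKey(attrName) {
  // Drop the "data-" prefix, then turn each "-x" into "X".
  return attrName
    .replace(/^data-/, '')
    .replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());
}

function datasetKeyToAttr(key) {
  // Each uppercase letter becomes "-" plus its lowercase form.
  return 'data-' + key.replace(/[A-Z]/g, (letter) => '-' + letter.toLowerCase());
}

console.log(attrToDatasetKey('data-user-id')); // userId
console.log(datasetKeyToAttr('userId'));       // data-user-id
```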
2. Usage in Plain JS Projects
In "Vanilla" or Plain JS projects, data attributes are extremely common and often essential. Since you don't have a sophisticated state management system like Redux or React State, the DOM itself often acts as the "source of truth."
- Frequency: Very High.
- Common Use Cases:
  - Identifying Elements: Storing a database ID on a button so you know which item to delete when clicked.
  - State Tracking: Storing whether a dropdown is data-expanded="true" to trigger CSS animations.
  - Configuration: Passing small bits of configuration from a backend template (like PHP or Liquid) into a script.
3. Usage in React Projects
In the React ecosystem, data attributes are used much less frequently for logic, though they still have a specific place.
- Frequency: Moderate to Low (Logic-wise), High (Testing/CSS-wise).
- Why the shift? In React, you use State and Props to manage data. Instead of reading a data-id from the DOM, you typically pass that ID directly into an event handler function using a closure or a bound argument.
Common Use Cases in React:
- Automated Testing: Developers often add data-testid="submit-button" so that testing tools (like React Testing Library or Cypress) can find the element reliably without depending on CSS classes that might change.
- CSS Selection: If you are using "Vanilla Extract" or standard CSS modules, you might use data attributes for styling states (e.g., data-loading={isLoading}).
- Analytics: Tagging elements for third-party scripts (like Google Tag Manager) that look for specific attributes to track clicks.
Comparison Table
| Feature | Plain JS Project | React Project |
|---|---|---|
| Primary Purpose | Storing "State" or IDs in the DOM. | Testing selectors and styling hooks. |
| Data Retrieval | element.dataset | Handled via State/Props variables. |
| Reliability | High (DOM is the source of truth). | Low (React state is the source of truth; the DOM is only its rendered output). |
| Standard Practice | Core part of the workflow. | Often replaced by scoped logic. |
Do you have a specific project in mind where you're deciding whether to use a data attribute or a state variable?
Some React libraries use data attributes
Me: In most of React sw code I have seen, all such data is handled using state variables. But some libraries IIRC that i used in React sw used data attributes.
Gem: You are absolutely right to notice that distinction. While application-level developers almost always prefer state, library authors and utility-tool creators rely on data attributes for a very specific reason: they need to bridge the gap between React's internal logic and the "outside world" (the global DOM).
Here is why libraries often break the "state-only" rule:
1. Interaction with Non-React Logic
Many React libraries are actually wrappers around standard JavaScript tools. For example, a tooltip library (like Tippy.js) or a drag-and-drop library (like Dnd-kit) needs to find elements in the DOM that it doesn't "own" through React state.
- The Problem: React state is private to the component tree.
- The Solution: The library attaches a data- attribute to the HTML element. This allows the library’s global event listeners to identify exactly which element is being interacted with, regardless of where it sits in the React hierarchy.
2. Styling States (CSS-in-JS)
Headless (unstyled) component libraries like Radix UI or Headless UI use data attributes to signal the "state" of a component to your CSS.
- Instead of React managing a complex string of class names like class="btn btn-active btn-large", the library will output <button data-state="open">.
- This makes your CSS much cleaner because you can target the state directly: button[data-state="open"] { ... }.
3. Avoiding "Prop Drilling" for Metadata
Sometimes a library needs to pass information down through many layers of components (like a theme or a specific ID) without forcing you to pass props manually every time.
- By setting a data-context-id on a parent container, any child (even one not managed by React state) can quickly look up the DOM tree to find that attribute.
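As a rough sketch of what "looking up the DOM tree" means, the hypothetical helper below walks from an element through its ancestors until it finds one carrying data-context-id. The function name findContextId is an assumption for illustration, and no particular library is claimed to work this way. In a browser, element.closest('[data-context-id]') does the same search in one call.

```javascript
// Hypothetical helper: walk up from an element until an ancestor carries
// data-context-id. The manual loop just makes the lookup explicit.
function findContextId(element) {
  let node = element;
  while (node) {
    if (node.dataset && node.dataset.contextId !== undefined) {
      return node.dataset.contextId;
    }
    node = node.parentElement; // null once the top of the tree is reached
  }
  return null; // no ancestor provides a context id
}

// Browser usage (the .child selector is an assumption):
if (typeof document !== 'undefined') {
  const child = document.querySelector('.child');
  if (child) console.log(findContextId(child));
}
```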
4. Accessibility (Aria and Beyond)
While aria-* attributes are the standard for accessibility, many libraries use data-* attributes to store internal calculation results (like the coordinates of a popover) that shouldn't be read by a screen reader but are vital for the library to position elements correctly on the screen.
Summary of the "Divide"
| Who uses it? | Preferred Method | Why? |
|---|---|---|
| App Developers | State Variables | To keep UI synced with data logic. |
| Library Authors | Data Attributes | To target DOM elements from global scripts or CSS. |
Pro Tip: If you find yourself reaching for a data attribute in your own React code to handle logic, ask yourself: "Can I do this with useState or useContext?" If the answer is yes, stick to state. If you are building a reusable UI component that needs to be styled based on state, data- attributes are your best friend.
Are you working with a specific library right now that's using these attributes, or are you just digging into the architectural "why"?
W3Schools HTML tutorial has limited coverage of data attributes
Me: W3schools HTML tutorial seems to skip data attributes. It only seems to have a short reference page on it - https://www.w3schools.com/tags/att_data-.asp . Can you search and see whether w3schools does have a detailed tutorial/explanation of data attributes?
Gem: After searching, it appears that W3Schools does not have a dedicated, long-form "tutorial" specifically for data attributes like they do for HTML Tables or Forms. The page you found is their primary documentation for this feature.
However, they do integrate information about data-* attributes across several different sections of their site:
1. The Global Attribute Reference
The page you linked is part of their HTML Attribute Reference. It covers the essential rules:
- Naming: Must be lowercase and at least one character after data-.
- Value: Can be any string.
- Purpose: Storing custom data private to the page/application.
2. The JavaScript HTML DOM Reference
While there isn't a single tutorial, the logic for accessing these attributes is detailed in their JavaScript HTML DOM section under the dataset property. This is where most of the "how-to" logic resides.
- It explains how data-animal-type in HTML becomes element.dataset.animalType in JavaScript (the kebab-case to camelCase conversion).
3. CSS Attribute Selectors Tutorial
W3Schools covers the "styling" aspect of data attributes within their CSS Attribute Selectors tutorial. They show how to use [data-active="true"] as a selector to apply styles without needing extra classes.
Summary for your Blog Post
If you are mentioning this in your blog, you could note that while W3Schools provides the reference, they treat data attributes as a Global Attribute (like id or class) rather than a standalone topic.
For a developer looking for a more "tutorial-style" deep dive, you might suggest looking at MDN (Mozilla Developer Network), which provides a more narrative explanation of the "why" and "when" behind using them. Would you like me to find a more detailed alternative source for your reference?
MDN guide page for data attributes is recommended by Gemini
Me: Is this the main MDN tutorial page for data attributes - https://developer.mozilla.org/en-US/docs/Web/HTML/How_to/Use_data_attributes
Gem: Yes, that is the main and most comprehensive guide MDN offers for data attributes.
While MDN also has a technical reference page for the data-* attribute itself, the "How to" guide you found is the one intended for learning and implementation. It is much more detailed than the W3Schools reference and covers the full "triangle" of usage:
- HTML Syntax: How to define them (e.g., data-columns="3").
- JavaScript Access: Using the dataset property (e.g., article.dataset.columns) and querySelectorAll.
- CSS Access: Using the attr() function for content and attribute selectors for styling (e.g., article[data-columns="3"]).
Why this is the "Best" Tutorial:
- CamelCase Explanation: It clearly explains that data-index-number in HTML automatically becomes indexNumber in JavaScript, which is a common point of confusion.
- Practical Example: It provides a great comparison between using "class soup" (like class="callout callout--warning") vs. the cleaner data-attribute approach (class="callout" data-variant="warning").
- Accessibility Warning: It includes a crucial "Issues" section, reminding developers not to store content in data attributes that needs to be visible to assistive technologies (screen readers).
For your blog post, if you want to provide a deeper dive for your readers than the W3Schools snippet, this is the authoritative link to use.
Dynamically generated HTML elements may be more easily targeted via data attributes
Me: About:
Non-ID Attributes: Targeting via data attributes, which we discussed earlier:
document.querySelector('[data-user-id="123"]').
Well if I know that the data attribute is going to be on an element with a particular id, I can first get the element using getElementById() and then using element.dataset.userId (for above case) access the data. Am I on right track?
Gem: You are absolutely on the right track. In fact, combining the two is a very common and efficient pattern in JavaScript.
If you have a unique "anchor" like an ID, it is almost always better to grab that first. Once you have the element reference, the dataset object is the most "JavaScript-native" way to interact with your data.
1. Two Ways to Do the Same Thing
The querySelector approach: This does a search through the DOM (or a sub-tree) to find an element that matches that specific attribute string.
const userElement = document.querySelector('[data-user-id="123"]');
Your Proposed approach (Targeted Access): If you already know the ID of the container, this is faster and clearer:
const userElement = document.getElementById("user-profile");
const userId = userElement.dataset.userId; // Accesses data-user-id
2. When Your Approach is Better
Your method of using getElementById + .dataset wins in these scenarios:
- Readability: It’s very clear which element you are talking about.
- Multiple Attributes: If you have five different data attributes on one element (e.g., data-user-id, data-role, data-status), you only "find" the element once via ID, and then you just read the properties from the dataset object.
- Performance: As we discussed, getElementById is a direct "lookup" rather than a "search."
3. When querySelector becomes necessary
The only time you would have to use the querySelector('[data-...]') style is when the data attribute is the only unique thing you have.
For example, if you have a list of 100 users and you need to find the one with ID 123, but none of the div tags have unique IDs:
<div class="user-row" data-user-id="101">...</div>
<div class="user-row" data-user-id="123">...</div>
<div class="user-row" data-user-id="150">...</div>
In this specific case, you don't have a unique ID to call getElementById on, so you use: document.querySelector('[data-user-id="123"]').
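When the value in such a selector comes from a variable rather than a literal, the selector string has to be assembled. The sketch below is one way to do that; dataSelector is a hypothetical helper name, and it assumes values without line breaks. Browsers also provide CSS.escape() for escaping arbitrary values in selectors.

```javascript
// Hypothetical helper: build an attribute selector for a data-* attribute.
// Backslashes and double quotes in the value are escaped so the quoted
// string inside the selector stays valid.
function dataSelector(name, value) {
  const escaped = String(value).replace(/["\\]/g, '\\$&');
  return `[data-${name}="${escaped}"]`;
}

console.log(dataSelector('user-id', 123)); // [data-user-id="123"]

// In a browser, pass the result to querySelector:
if (typeof document !== 'undefined') {
  const row = document.querySelector(dataSelector('user-id', 123));
  if (row) row.style.backgroundColor = 'yellow';
}
```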
Summary Recommendation
- Known ID? Use document.getElementById('myId').dataset.myKey.
- No ID, just a unique data attribute? Use document.querySelector('[data-my-key="value"]').
It seems like your preference for getElementById is leading you toward writing very efficient code. Do you find yourself using data attributes more for logic (like tracking user IDs) or for styling (like toggling a data-theme="dark")?
Me: Hmm. Interesting! In a plain JS app if I have a dynamically built list of divs where each div is a row, rather than creating a unique id for each div, if I am using a unique data attribute like say data-user-id, then I can pick up that div using querySelector targeting that particular data attribute. ... I am saying plain JS app as in React/Nextjs for such needs of dynamic list, I will simply use some suitable 3rd party component and it will provide me mechanisms to access particular rows.
Gem: You have nailed the architectural trade-off here.
In a Plain JS app, using a data attribute as your "hook" is much cleaner than generating 100 different IDs like id="user-row-1", id="user-row-2", etc. It allows you to keep your IDs reserved for major layout sections (like #header or #display) while using attributes for data-driven elements.
The Plain JS Pattern
When you build that dynamic list, your code looks like a "Search and Act" pattern:
- Search: Use querySelector to find the needle in the haystack: const row = document.querySelector('[data-user-id="123"]');
- Act: Update that specific row: row.style.backgroundColor = "yellow";
This is very "DOM-centric." You are treating the HTML as your database.
The React/Next.js Shift
You mentioned that in React you'd use a 3rd party component, which is true for complex things like DataGrids, but even in a standard React list you build yourself, the mindset flips from "Search" to "Sync."
- No Searching: In React, you almost never use querySelector. You don't "go out and find" the div.
- State Mapping: You update a piece of state (an array of objects). React re-renders the list and, given stable keys, updates only the DOM for the row whose data changed (here, the object with id: 123).
- Refs: If you absolutely need to touch the DOM element (e.g., to focus a text box), you use a useRef hook, which gives you a direct variable reference to the element without ever needing an ID or a selector.
Summary of your "Track"
You are on the right track because you are recognizing that Data Attributes are the bridge between your data and your UI in environments where a framework isn't managing that link for you.
- IDs: Use for unique, static structural elements.
- Data Attributes: Use for dynamic, data-driven elements (rows, buttons in a list, etc.).
- Classes: Use for shared visual styling.
It's a very disciplined way to write Vanilla JS. Do you find that keeping your IDs to a minimum makes your CSS and JS feel less "cluttered" as the project grows?
Backend server populates elements with data attributes to identify them
Me: Now I am reviewing this part of the above (section):
2. Usage in Plain JS Projects
...
Identifying Elements: Storing a database ID on a button so you know which item to
delete when clicked.
...
Let us look at Identifying elements use case. in this plain JS app case, it is the backend server that will populate the HTML with suitable data-attributes typically, isn't it?
Gem: You are exactly right. In a Vanilla JS architecture, the backend server acts as the primary "orchestrator" of the initial state.
Because you aren't using a client-side framework to fetch JSON and build the UI dynamically, the server-side template (rendered via PHP, Node/EJS, Python/Jinja, etc.) is responsible for embedding the "metadata" directly into the HTML so the JavaScript knows what to do later.
The Typical Workflow
- Server-Side Rendering: Your server queries the database and gets a list of items.
- HTML Injection: As the server generates the HTML, it "stamps" each element with its unique identity using a data attribute.
  - Example: <button class="delete-btn" data-item-id="4502">Delete</button>
- Client-Side "Activation": Once the page loads in the browser, your Plain JS script attaches an event listener to that button.
- Retrieval: When the user clicks, the script reads the attribute: const id = event.target.dataset.itemId; // "4502"
Why this is the "Standard" for Plain JS
- Decoupling: The JavaScript doesn't need to know how the ID was generated or what the item's name is. It only needs to know that "when this button is clicked, tell the server to delete ID 4502."
- Simplicity: You avoid having to store a large JavaScript object in memory that mirrors the database. The DOM holds the information until the moment it is needed.
- Backend Flexibility: Since the backend (PHP/Node) is already looping through the data to create the list, it is the most efficient place to "attach" the IDs.
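The "stamping" step described above could look roughly like this on a Node backend. This is a minimal sketch, not a specific template engine: escapeHtml, renderDeleteButton, and the sample items are all assumptions for illustration. The escaping matters because item data ends up inside an HTML attribute and element text.

```javascript
// Escape a value for use inside a double-quoted HTML attribute or as text.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical server-side template function: one delete button per item,
// with the database ID carried in data-item-id.
function renderDeleteButton(item) {
  return `<button class="delete-btn" data-item-id="${escapeHtml(item.id)}">` +
    `Delete ${escapeHtml(item.name)}</button>`;
}

// Hypothetical rows, standing in for a database query result.
const items = [
  { id: 4502, name: 'Invoice' },
  { id: 4503, name: 'Draft "B"' },
];
console.log(items.map(renderDeleteButton).join('\n'));
```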
Contrast with React
In your React projects, the server usually just sends a raw JSON array. React then maps over that array and keeps the ID in its Virtual DOM state. You don't "look it up" from the HTML; you simply pass the ID directly into the function: onClick={() => deleteItem(item.id)}.
For your blog or reference notes, you might describe this as:
"In Plain JS, the HTML is the transport layer for metadata; in React, State is the transport layer."
Does this match how you recall your previous software consulting projects handling data—where the backend did the heavy lifting of preparing the UI?
Me: Great explanation. The React example seems to fit in with the React sw dev work I did. I never worked on a plain JS app that interacts with a database and so your example confirms how I thought such a JS app would use data attributes. About event listener for the button, I guess even that can be part of the HTML itself with an onclick attribute. Am I right?
Gem: You are exactly right. There are two main ways to "wire up" that button in a Plain JS application, and using the onclick attribute directly in the HTML is the more traditional "server-centric" way.
1. The Inline onclick Approach
In this scenario, the backend server injects both the data and the function call into the HTML.
The HTML (Generated by the Server):
<button onclick="handleDelete(this)" data-item-id="4502">Delete</button>
The JavaScript:
function handleDelete(button) {
  const id = button.dataset.itemId;
  console.log("Preparing to delete item:", id);
  // Proceed with fetch/XHR request to backend
}
- Why pass this? By passing this into the function, you give the JavaScript a direct reference to the button that was clicked, making it very easy to access the dataset without searching the DOM.
2. The Modern "Separation of Concerns" Approach
Alternatively, you can keep your HTML "clean" of logic and use JavaScript to find the buttons and attach listeners after the page loads.
The HTML:
<button class="delete-btn" data-item-id="4502">Delete</button>
The JavaScript:
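// Find all delete buttons and attach a click listener to each
document.querySelectorAll('.delete-btn').forEach(btn => {
  btn.addEventListener('click', (event) => {
    // currentTarget is always the button the listener is attached to,
    // even if the click landed on a child element inside it
    const id = event.currentTarget.dataset.itemId;
    handleDelete(id);
  });
});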
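For a dynamically built list, a common variant of this approach is event delegation: one listener on a container instead of one per button, so rows added later are covered too. The sketch below is an illustration under assumptions; the helper name itemIdFromClick and the #item-list container id are not from the original example.

```javascript
// Hypothetical helper: given the element that was clicked, find the
// enclosing delete button (if any) and return its item id.
function itemIdFromClick(target) {
  const button = target.closest('.delete-btn[data-item-id]');
  return button ? button.dataset.itemId : null;
}

// Browser wiring: one listener on a container (the #item-list id is an
// assumption) covers every current and future button inside it.
if (typeof document !== 'undefined') {
  const list = document.getElementById('item-list');
  if (list) {
    list.addEventListener('click', (event) => {
      const id = itemIdFromClick(event.target);
      if (id !== null) console.log('Preparing to delete item:', id);
    });
  }
}
```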
Comparison for your Blog/Reference
| Feature | Inline onclick | addEventListener |
|---|---|---|
| Simplicity | Very high; everything is in one place. | Requires an extra step to "find" elements. |
| Server Work | Server must write the string onclick="...". | Server only writes the data-* attribute. |
| Debugging | Easy to see which function runs by looking at HTML. | Harder; must check JS files for listeners. |
| Scale | Can get messy if logic is complex. | Better for complex apps with many listeners. |
My View
For the kind of "Plain JS" logic you are describing, where the backend does most of the heavy lifting, the Inline onclick with this is often much more intuitive. It mimics the "Component" feel of React where the behavior is tied directly to the element, but without the overhead of a framework.
In your React work, you likely used onClick={handleDelete}, which looks very similar to the inline version, except that React handles the "wiring" behind the scenes to keep it efficient.
Does the onclick="handleDelete(this)" pattern feel more aligned with the way you'd structure a simple, server-rendered tool?
State tracking using data attributes
Me: Now about the second example:
State Tracking: Storing whether a dropdown is data-expanded="true" to trigger CSS animations.
Who will set data-expanded to true or false? I guess the browser will not do it and app will need code that does it. am i right?
Gem: You are exactly right. The browser is completely "blind" to the meaning of custom data attributes; it treats them purely as static strings. It is the responsibility of your application's JavaScript to toggle those values in response to user interaction.
This is a classic example of using the DOM as a state machine.
How the Workflow Functions
- Initial State (HTML): You define the dropdown as closed by default.
  <div id="myDropdown" data-expanded="false"> </div>
  <button onclick="toggleMenu()">Menu</button>
- The Interaction (JavaScript): When the user clicks the button, your code flips the "switch."
  function toggleMenu() {
    const menu = document.getElementById('myDropdown');
    // Toggle between the "true" and "false" strings
    const isExpanded = menu.dataset.expanded === "true";
    menu.dataset.expanded = String(!isExpanded);
  }
- The Visual Result (CSS): The browser sees the attribute change and automatically applies the corresponding CSS rules.
  #myDropdown[data-expanded="true"] {
    display: block;
    animation: slideDown 0.3s ease-out;
  }
  #myDropdown[data-expanded="false"] {
    display: none;
  }
Why use data-expanded instead of a CSS Class?
In Plain JS projects, you often see data-expanded="true" used instead of a class like .is-open for two reasons:
- Semantic Clarity: data-expanded explicitly describes the state of the component, whereas classes are often seen as purely for styling.
- Boolean-like Logic: It allows you to use very clean CSS selectors that look like logic gates ([data-expanded="true"] vs [data-expanded="false"]).
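One detail behind the "boolean-like" wording: dataset values are always strings, so the string "false" is still truthy in JavaScript and needs an explicit comparison. A small sketch of helpers that handle this (readFlag and toggleFlag are hypothetical names, not a standard API):

```javascript
// dataset values are always strings, so "false" is still truthy.
console.log(Boolean('false')); // true

// Hypothetical helper: read a "true"/"false" data attribute as a boolean.
function readFlag(element, key) {
  return element.dataset[key] === 'true';
}

// Hypothetical helper: flip the flag and store it back as an explicit string.
function toggleFlag(element, key) {
  const next = !readFlag(element, key);
  element.dataset[key] = String(next);
  return next;
}

// Browser usage with the dropdown from the example above:
if (typeof document !== 'undefined') {
  const menu = document.getElementById('myDropdown');
  if (menu) toggleFlag(menu, 'expanded');
}
```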
Comparison with React
In your React work, you wouldn't touch the DOM attribute directly. You would have a piece of state like const [isExpanded, setIsExpanded] = useState(false).
However, even in React, many developers still choose to output that state to a data attribute so they can keep their CSS logic separate and clean:
// React Example
<div data-expanded={isExpanded.toString()}> ... </div>
Summary for your Reference
- The Browser: Does nothing automatically with data attributes.
- The JS App: Listens for clicks and updates the attribute value (the "Setter").
- The CSS: Watches for that specific attribute value to trigger animations (the "Observer").
Does this clarify the "State Tracking" use case for your documentation? It’s essentially a way to use the HTML element itself as a variable that the CSS can "read."
Passing small bits of configuration using data attributes
Me: Good explanation. understood.
Now about: Configuration: Passing small bits of configuration from a backend template (like PHP or Liquid) into a script.
Please give me an example.
Gem: This is a common pattern used to "bootstrap" a JavaScript application with data that is only known by the server at the moment the page is rendered.
Instead of making a second API call to ask the server "Who is the current user?" or "What is the API key?", the server simply embeds that configuration directly into a data attribute on a main container.
The Example: A Video Player Configuration
Imagine you have a custom video player. The server needs to tell the JavaScript which video to load, whether it should autoplay, and what the user's preferred volume is.
1. The HTML (Generated by the Backend - e.g., PHP)
The backend "stamps" the configuration onto a wrapper div.
<div id="video-player-root"
data-video-id="vid_99283"
data-autoplay="true"
data-user-theme="dark"
data-api-endpoint="https://api.mysite.com/v1">
<div class="spinner">Loading Player...</div>
</div>
2. The JavaScript (Initialization)
When the script runs, it reads the "Config" from the DOM and initializes the application accordingly.
// 1. Get the 'root' configuration element
const playerRoot = document.getElementById('video-player-root');
// 2. Extract the configuration object
const config = playerRoot.dataset;
// 3. Use the configuration
console.log("Loading video ID:", config.videoId);
if (config.autoplay === "true") {
console.log("Autoplay is enabled for this user.");
// code to start video...
}
// Set the API path globally for this session
const apiPath = config.apiEndpoint;
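Since every dataset value arrives as a string, an initialization script often converts the raw values into typed settings once, up front. The sketch below shows one way to do that for the attributes in this example; parsePlayerConfig and its fallback values are assumptions for illustration.

```javascript
// Hypothetical helper: turn the raw dataset strings into typed settings.
// The fallback values are assumptions for illustration.
function parsePlayerConfig(dataset) {
  return {
    videoId: dataset.videoId || null,
    autoplay: dataset.autoplay === 'true', // string -> boolean
    userTheme: dataset.userTheme || 'light',
    apiEndpoint: dataset.apiEndpoint || '/api',
  };
}

// Browser usage with the wrapper div from the example above:
if (typeof document !== 'undefined') {
  const root = document.getElementById('video-player-root');
  if (root) console.log(parsePlayerConfig(root.dataset));
}
```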
Why is this better than using <script> variables?
Before data attributes were standard, developers used to do this:
var apiEndpoint = "<?php echo $endpoint; ?>";
Using data attributes is considered superior because:
- Security: It avoids putting raw logic/variables directly into the global window scope where they can be accidentally overwritten.
- Valid HTML: It keeps your code as valid HTML5.
- Encapsulation: If you have multiple video players on one page, each one can carry its own unique configuration in its own data- attributes.
Real-World Use Case for You
Since you manage several Blogger sites, you might use this if you were writing a custom widget. The Blogger backend could provide the "Blog ID" or "Post Count" as a data attribute, and your Plain JS script would read that attribute to decide which data to fetch from the Blogger API.
Does this "Configuration" pattern feel like a useful bridge for the type of standalone utilities you build?
Me: Hmm. If I were to write such a plain JS app, I would have used a script variable for such config info. But your points about data attributes being superior seem valid.
Gem: Using script variables (global var or window properties) was the industry standard for over a decade, so your instinct is grounded in very common historical practice. However, the shift toward data attributes for configuration represents a move toward a more "declarative" and secure way of coding.
Here is a quick look at why the data attribute approach eventually won out in professional development.
1. The "Global Pollution" Problem
When you use a script variable like var apiConfig = { ... }, that variable is added to the global window object.
- The Risk: Any other script on the page (a third-party analytics tool, a social media widget, or even a browser extension) can read, modify, or accidentally overwrite your configuration.
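- The Data Attribute Fix: The configuration is scoped specifically to the HTML element it belongs to. It stays out of the global namespace until your specific script chooses to read it. Other scripts can still read the DOM, so this guards against accidental name collisions rather than deliberate access.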
2. Supporting Multiple Instances
If you have a script variable, you can usually only have one configuration per page.
- Scenario: Imagine you want two different "Weather Widgets" on one page—one for New York and one for Mumbai.
- The Script Variable Way: You’d need complex logic like var config1 = ...; var config2 = ...; and your JS would need to know which is which.
- The Data Attribute Way: You simply have two <div> tags. Each one carries its own "City" in a data attribute. One JS function handles both perfectly by reading the attributes of the element it is currently working on.
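A rough sketch of the data attribute way for the weather widget scenario: each element describes itself, so a single function can initialize any number of instances. The data-city attribute, the .weather-widget class, and the function name are assumptions for illustration.

```javascript
// Hypothetical widget initializer: every element describes itself through
// data-city, so one function serves any number of instances.
function initWeatherWidget(element) {
  const city = element.dataset.city || 'Unknown';
  element.textContent = `Weather for ${city}`;
  return city;
}

// Browser wiring; the .weather-widget class name is an assumption.
if (typeof document !== 'undefined') {
  document.querySelectorAll('.weather-widget').forEach(initWeatherWidget);
}
```

With markup such as two div elements carrying data-city="New York" and data-city="Mumbai", the same function would label each widget from its own attribute.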
3. Separation of Concerns
Data attributes allow the Backend to provide data without needing to know anything about the JavaScript syntax.
- The backend just needs to know how to write a standard HTML attribute.
- The JavaScript just needs to know how to read the DOM.
- This is much cleaner than the backend trying to "write" a JavaScript object string into a <script> tag, which often leads to syntax errors if the data contains quotes or special characters.
Summary Checklist for your Blog
If you are deciding between a script variable and a data attribute for a future utility:
| Use a Script Variable if... | Use a Data Attribute if... |
|---|---|
| The data is truly global for the whole site. | The data belongs to a specific UI element. |
| You are writing a very tiny "one-off" script. | You want to allow multiple instances on one page. |
| | You want to keep the global window object clean. |
Does this help solidify why the data attribute is the more "modern" choice for these standalone components? It really comes down to making your code more "reusable" and "rugged."
Some React libraries using data attributes to identify elements it interacts with
Me: Now about this part:
Many React libraries are actually wrappers around standard JavaScript tools. For example, a tooltip library (like Tippy.js) or a drag-and-drop library (like Dnd-kit) needs to find elements in the DOM that it doesn't "own" through React state.
The Problem: React state is private to the component tree.
The Solution: The library attaches a data- attribute to the HTML element. This allows the library’s global event listeners to identify exactly which element is being interacted with, regardless of where it sits in the React hierarchy.
---
I did not follow the above. Can you give me a more detailed explanation.
Gem: This is one of the more advanced "architectural" uses of data attributes. To understand it, we have to look at the conflict between how React works and how standard JavaScript libraries work.
1. The Conflict: Two Different "Brains"
- React's Brain: React wants to be the "manager." It keeps track of everything in a private, virtual tree. If you want to change a color, you change state, and React updates the DOM for you.
- The Library's Brain (e.g., Tippy.js): These libraries often use a "Global Listener." They sit at the very top of the webpage (on the document or window) and wait for a mouse to move or click. They don't know anything about React components or state; they only see raw HTML tags.
2. The Problem: "Who are you?"
Imagine you have a list of 50 buttons in React, and you want a tooltip to appear when a user hovers over one.
The Global Listener (the Library) detects a mouseover event on a
button. It needs to know:
- "Is this a button I am supposed to show a tooltip for?"
- "What text should I show in the tooltip?"
The Library cannot "reach into" React's private state to find the answer. It only has access to the physical HTML element the mouse is touching.
3. The Solution: The "ID Badge" (Data Attributes)
This is where the library uses data attributes as a bridge. When the library’s React component renders a button, it "stamps" it with data attributes:
The HTML React produces:
<button data-tippy-content="Delete this file" data-tippy-id="button_1">
Delete
</button>
The Library's Global Logic: Now, when the mouse hovers over that button, the Library does this:
- It catches the event on the document.
- It looks at the event.target (the button).
- It checks: if (event.target.dataset.tippyId)
- If it finds that attribute, it knows: "Aha! This is one of mine. And I see the content should be 'Delete this file'."
4. Why is this better than "Props"?
You might ask: "Why doesn't the library just use React Props?" The problem is that many of these libraries (like Drag-and-Drop) need to track the mouse outside of the React component's boundaries—like when you drag an item across the whole screen.
By putting the data on the HTML element itself:
- The library can find the element using document.elementFromPoint() or standard DOM selectors.
- The library doesn't have to "re-render" your whole React app just to track a mouse movement.
- It decouples the heavy lifting (the logic of the tooltip or dragging) from the rendering (React).
Summary for your Reference
In this scenario, the data attribute acts as a publicly accessible ID badge. React puts the badge on the element so that a third-party script—which isn't part of the React family—can recognize it and know what to do with it.
Does this help clarify why a "wrapper" library needs to "leave a mark" on the DOM? It's essentially leaving a message for its non-React "engine" to find later.
Me: My understanding has improved with your explanation but if I need to understand it fully I will need to delve into what React allows a library component to query and find. I don't want to get into that detail now. From what you said, it seems that React does not provide as much info. as HTML DOM API provides. So for a third party component, data is more easily accessible via the HTML DOM API rather than what React provides. Thus for third party component, data attributes is a good choice for such 'marker' data embedding on its components. Am I on right track here?
Gem: You are exactly on the right track. You’ve touched on a fundamental truth of web development: The DOM is the "Universal Language," while React is a specific "Dialect."
Here is a breakdown of why your reasoning is correct regarding that "marker" strategy:
1. The "Information Gap"
React creates a Virtual DOM, which is essentially a lightweight JavaScript object that mirrors the real DOM.
- React's limitation: If a third-party library is looking at the page from the outside, it cannot see React's Virtual DOM. It has no way to ask React, "What are the internal props of that div?"
- The DOM's strength: The standard HTML DOM API is public and global. Any script can ask the browser, "Give me all elements that have a data-tippy-id attribute."
2. The "Bridge" Analogy
Think of a third-party library as a Foreign Inspector visiting a factory (your React App).
- The Inspector doesn't speak the factory's internal language (React State/Props).
- To help the Inspector, the factory workers (React Components) pin Identity Badges (Data Attributes) onto the products (HTML Elements).
- Now, the Inspector can walk through the factory and identify every item simply by looking at the badges, without needing to talk to the factory manager.
3. Why Data Attributes are the Best "Markers"
As you deduced, data attributes are the perfect choice for these libraries because:
- They are Passive: They don't interfere with React's rendering or logic.
- They are Standard: Every browser since 2011 understands how to handle them.
- They are Specific: A library can use a unique namespace (like data-dnd-type) so it doesn't accidentally interfere with another library using data-tippy-type.
Summary for your Reference
You have correctly identified that for a library to be interoperable (able to work across different frameworks or even in plain JS), it has to rely on the lowest common denominator: the HTML DOM API. Data attributes provide the most structured way to pass information from the framework "inside" to the library "outside."
You've managed to grasp a very high-level architectural concept without getting bogged down in the complex "Refs" or "Portals" logic of React.
Would you like to wrap up this section of your notes, or is there another part of the "Data Attributes" summary you'd like to double-check?
React Styling libraries like Headless UI using data attributes to signal state to CSS
Me: About:
- Styling States (CSS-in-JS) Modern styling libraries like Radix UI or Headless UI use data attributes to signal the "state" of a component to your CSS. Instead of React managing a complex string of class names like class="btn btn-active btn-large", the library will output <button data-state="open">. This makes your CSS much cleaner because you can target the state directly: button[data-state="open"] { ... }.
---
I did use Headless UI in some React app I developed. Perhaps that's where I saw data attributes being used but I am not sure. Give me an example of how a Headless UI component uses data attributes and which the app developer uses/needs to know about.
Gem: You are likely remembering correctly—Headless UI and Radix UI are the biggest drivers of data-attribute usage in the modern React ecosystem.
The core philosophy of Headless UI is that it handles all the "logic" (keyboard navigation, ARIA attributes, and state) but provides zero styling. To allow you to style the components, it exposes the internal state through data attributes.
Example: The Headless UI "Switch" (Toggle)
When you use a Switch component, you don't want to manually
manage a bunch of CSS classes like .toggle-on or
.toggle-off. Instead, Headless UI automatically manages a
data-checked attribute.
1. The React Code
You simply use the component and manage a boolean state.
import { Switch } from '@headlessui/react'
import { useState } from 'react'

function MyToggle() {
  const [enabled, setEnabled] = useState(false)

  return (
    <Switch
      checked={enabled}
      onChange={setEnabled}
      className="my-toggle-switch"
    >
      <span className="sr-only">Enable notifications</span>
      <span className="toggle-handle" />
    </Switch>
  )
}
2. The Rendered HTML (What the browser sees)
When enabled is true, Headless UI automatically
adds a data-checked attribute to the button:
<button class="my-toggle-switch" data-checked>
<span class="toggle-handle"></span>
</button>
3. What you, the Developer, need to know
Instead of writing complex logic inside your className prop, you
write your CSS (or Tailwind) to "watch" that attribute.
In Plain CSS:
/* Styling the background based on state */
.my-toggle-switch[data-checked] {
  background-color: #4ade80; /* green when on */
}

/* Moving the handle based on the parent's state */
.my-toggle-switch[data-checked] .toggle-handle {
  transform: translateX(1rem);
}
In Tailwind CSS: Headless UI popularized the
data- modifier. You can write:
<Switch className="bg-gray-200 data-[checked]:bg-green-500"
...>
Why this is better for the App Developer:
- Cleaner Props: You don't have long strings of conditional logic like: className={enabled ? 'bg-green-500' : 'bg-gray-200'}.
- State Synchronization: You don't have to worry about your CSS getting "out of sync" with the React state. If Headless UI thinks the switch is on, the data-checked attribute will be there.
- Complex States: It gets even more useful for complex components like Tabs or Listboxes. Headless UI will manage attributes like:
  - data-selected (for the active tab)
  - data-hover (when the mouse is over an option)
  - data-active (when the button is being pressed)
  - data-disabled (when the component is locked)
Summary for your Reference
In Headless UI, Data Attributes are the "API" for your styles. The library uses the attributes to "broadcast" what is happening inside the component so that your CSS can react to it without you having to write manual "bridge" code between React and the DOM.
Does this spark a memory of how you styled those Headless UI components in your previous project? Did you use Tailwind or standard CSS?
Me: Yes, IIRC, I used Tailwind CSS and Headless UI together. So, guided by appropriate documentation, I did use something like:
<... className="bg-gray-200 data-[checked]:bg-green-500" ...>
But I did not understand how it was based on data attributes and CSS. Now that picture is becoming clearer.
In Tailwind CSS
data-[checked]
must be equivalent to
CSS selector of [data-checked]
am i on right track here?
Gem: You are exactly on the right track. You have correctly identified the "shorthand" logic that Tailwind CSS uses to bridge the gap between HTML attributes and visual styles.
The Direct Comparison
In the world of CSS, those two expressions are functional twins. Tailwind simply wraps the standard CSS selector syntax into a more compact "utility" format.
| Tailwind Syntax | Equivalent Standard CSS Selector |
|---|---|
| data-[checked] | [data-checked] |
| data-[state=open] | [data-state="open"] |
| data-[orientation=vertical] | [data-orientation="vertical"] |
How it Works Under the Hood
When you write data-[checked]:bg-green-500 in your React code,
Tailwind generates a CSS rule in your stylesheet that looks something like
this:
/* Tailwind's generated CSS (simplified) */
.data-\[checked\]\:bg-green-500[data-checked] {
  background-color: #22c55e; /* green-500 in Tailwind v3's default palette */
}
This rule essentially says:
"Apply this green background only if the element has both this
specific class name and the
data-checked attribute is present."
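To make the mapping in the comparison table concrete, here is a small sketch that turns a `data-[...]` modifier into the attribute selector it corresponds to. It is purely illustrative: the function name is hypothetical and this is not Tailwind's own code, which handles many more cases.

```javascript
// Illustrative mapping only; not Tailwind's implementation.
function dataModifierToSelector(modifier) {
  // Matches data-[name] or data-[name=value]
  const match = /^data-\[([^\]=]+)(?:=(.*))?\]$/.exec(modifier);
  if (!match) {
    throw new Error(`Not a data-[...] modifier: ${modifier}`);
  }
  const [, name, value] = match;
  return value === undefined
    ? `[data-${name}]`
    : `[data-${name}="${value}"]`;
}

console.log(dataModifierToSelector("data-[checked]"));    // [data-checked]
console.log(dataModifierToSelector("data-[state=open]")); // [data-state="open"]
```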
Why this is a "Power Move" for Headless UI
Because Headless UI components (like the Switch or
Tab) manage the attribute for you automatically,
you don't have to write any "if/else" logic inside your
className string.
Without the data- modifier (Manual Logic):
// You have to manually check the state to decide the class
<Switch className={enabled ? "bg-green-500" : "bg-gray-200"} />
With the data- modifier (Declarative Logic):
// You let the component's state drive the CSS via the attribute
<Switch className="bg-gray-200 data-[checked]:bg-green-500" />
Summary for your Reference
- Tailwind's data-[] is a dynamic modifier that allows you to target any data attribute without leaving your HTML file.
- Headless UI provides the "State" by toggling the attribute.
- Tailwind provides the "Style" by watching that attribute.
It’s a very elegant "handshake" between your logic and your design. Now that you see the CSS selector behind the Tailwind shorthand, does it make the "magic" of Headless UI feel a bit more grounded in standard web tech?
Me: Great explanation. The 'magic' of Headless UI and TW CSS usage is now much less of magic and more of web tech usage. Hmm. Fascinating to see how data attributes are used in this use case in React apps.
React library using data attributes for metadata avoiding prop drilling
Now moving to:
3. Avoiding "Prop Drilling" for Metadata Sometimes a
library needs to pass information down through many layers of components (like
a theme or a specific ID) without forcing you to pass props manually every
time.
By setting a data-context-id on a parent container, any child—even those not managed by React state—can quickly look up the DOM tree to find that attribute.
---
Hmm. Fascinating! So any component in a React app whether library or not, has this ability to get global data set/get functionality outside of React state by using data attributes on a known high level parent container. Am I right?
Gem: You have hit on a very clever "escape hatch" in web architecture. You are exactly right: while React tries to keep everything in a strict, top-down hierarchy of props, the DOM tree exists alongside it as a parallel, global structure.
By using data attributes on a high-level container, you are essentially using the HTML itself as a "Physical Context" that any component can reach out and touch.
1. The "Look Up" Strategy
In React, if a child needs data from a parent, the parent must pass it down
(Prop Drilling) or use a Context Provider.
However, any component—even one deep in the basement of your app—can bypass React's rules and look up the physical DOM tree to find a "marker" on a parent:
// A component deep in your React tree
const handleClick = (event) => {
  // Find the closest parent that has our special data attribute
  const container = event.target.closest('[data-app-tenant-id]');
  // closest() returns null when no ancestor carries the attribute
  if (!container) return;
  const tenantId = container.dataset.appTenantId;
  console.log("Found tenant ID from the DOM tree:", tenantId);
};
2. Why is this useful in React?
You might wonder, "Why not just use React Context?" There are three specific reasons why developers sometimes prefer the data-attribute "look up":
- Integration with Non-React Code: If your React app uses a legacy jQuery plugin or a third-party charting library (like D3.js) that lives "outside" of React, that library can't read React Context. It can, however, see the data attributes on the HTML.
- Performance (Avoiding Re-renders): When React Context changes, every component listening to it re-renders. If you only need a piece of metadata (like a data-theme="dark") for a quick CSS calculation or an analytics event, reading it from the DOM is "free": it doesn't trigger any React lifecycle events.
- Decoupling Components: It allows a child component to be used inside different "containers" without needing to be "wrapped" in a specific Provider every time. It just looks up and says, "Am I inside a container with this ID? Yes? Then I'll use it."
3. The Trade-off (The "Peers" Warning)
As a former software consultant, you'll recognize the trade-off here: Dependability vs. Convenience.
- The Risk: If you rely on this too much, your React components become dependent on the HTML structure. If you move a component outside of that parent container, it will suddenly "break" because it can no longer find the attribute it was looking for.
- The React Way: React prefers "Explicit" data (Props/Context) because it makes the app more predictable.
- The "Escape Hatch" Way: Data attributes are "Implicit." They are there in the background, waiting to be found.
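One way to soften the risk described above is to make the "look up" tolerate a missing container. The following is a minimal sketch under that assumption; `findContextValue` is a hypothetical helper, and it only relies on `parentElement` and `dataset`, which DOM elements provide.

```javascript
// Walks up from `element` (inclusive) to the nearest node whose dataset has `key`.
// Returns `fallback` instead of failing when no such ancestor exists.
function findContextValue(element, key, fallback = null) {
  for (let node = element; node; node = node.parentElement) {
    if (node.dataset && node.dataset[key] !== undefined) {
      return node.dataset[key];
    }
  }
  return fallback;
}

// Usage in a browser (attribute name taken from the example above):
// const tenantId = findContextValue(event.target, "appTenantId", "unknown");
```

With a fallback in place, moving the component outside the expected container degrades to a default value rather than a runtime error, although the dependency on the HTML structure is still there.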
Summary for your Reference
You are correct that this provides a Global Get/Set functionality outside of React state. It is essentially a way to treat the Browser's DOM as a shared database that is accessible to everyone on the page, regardless of whether they are a React component, a plain JS script, or even a CSS file.
Does this concept of the "DOM as a shared database" change how you view the relationship between your React components and the actual HTML they produce?
Data attributes store data in DOM itself in contrast to JavaScript variables
Me: Good point about React component re-use breaking if data attributes in parent container are used/expected. Noted. But overall, data attributes seem to be a mechanism to store data right in the DOM itself as against storing it in JavaScript variables. And this data can be quite varied as data attribute naming system is perhaps as flexible as Javascript variable naming system. Is my overall view correct?
Gem: You are spot on. Your summary perfectly captures the shift in perspective from "Code-managed data" to "DOM-embedded data."
To refine your "Correct View," here are a few final technical nuances that solidify that comparison between data attributes and JavaScript variables:
1. The Naming Flexibility
You are correct—it is incredibly flexible. The only real rules are:
- It must start with data-.
- It should not contain uppercase letters in the HTML (e.g., data-userId will be treated as data-userid).
- The CamelCase Trick: JavaScript automatically converts kebab-case in HTML to camelCase in the dataset object.
  - HTML: data-last-login-time
  - JS: element.dataset.lastLoginTime
This makes it feel exactly like working with a standard JavaScript object.
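The naming rules above can be sketched as two small pure functions. The function names are hypothetical; the conversion itself follows the standard `dataset` rule (drop the `data-` prefix, then turn each hyphen followed by a lowercase letter into the uppercase letter).

```javascript
// Attribute name -> dataset key, e.g. "data-last-login-time" -> "lastLoginTime".
function attributeToDatasetKey(attributeName) {
  if (!attributeName.startsWith("data-")) {
    throw new Error(`Not a data attribute: ${attributeName}`);
  }
  return attributeName
    .slice("data-".length)
    .replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());
}

// Dataset key -> attribute name, e.g. "lastLoginTime" -> "data-last-login-time".
function datasetKeyToAttribute(key) {
  return "data-" + key.replace(/[A-Z]/g, (letter) => "-" + letter.toLowerCase());
}

console.log(attributeToDatasetKey("data-last-login-time")); // lastLoginTime
console.log(datasetKeyToAttribute("lastLoginTime"));        // data-last-login-time
```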
2. The "Global Database" vs. "Local Variable"
Think of your JavaScript variables as RAM (fast, but private to the script) and your Data Attributes as a Public Ledger (accessible to anyone—CSS, other scripts, or the browser's own Inspector).
3. Key Differences (Where the variable naming analogy ends)
While the naming is flexible, there are two areas where data attributes differ from variables that you should keep in mind for your documentation:
- Type Casting: In JavaScript, a variable can be a Number, Boolean, or Object. In the DOM, everything is a String.
  - If you store data-count="5", you must use Number(element.dataset.count) before doing math.
- Visibility: Data attributes are visible to the end-user. If they "Inspect Element," they can see your data. This is why you use them for Metadata and Configuration, but never for Secrets (like passwords or private keys).
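Since every dataset value arrives as a string, a few small conversion helpers can keep the casting in one place. This is only a sketch: the helper names are hypothetical, and treating only the exact string "true" as true is an assumed convention, not a rule of the DOM.

```javascript
// `dataset` can be element.dataset or any plain object of strings.
function readNumber(dataset, key, fallback = 0) {
  const raw = dataset[key];
  if (raw === undefined || raw === "") return fallback;
  const parsed = Number(raw);
  return Number.isNaN(parsed) ? fallback : parsed;
}

function readBoolean(dataset, key) {
  // Assumed convention: only the exact string "true" counts as true.
  return dataset[key] === "true";
}

function readJson(dataset, key, fallback = null) {
  if (dataset[key] === undefined) return fallback;
  try {
    return JSON.parse(dataset[key]);
  } catch {
    return fallback; // The attribute held malformed JSON.
  }
}

const sample = { count: "5", active: "true", tags: '["a","b"]' };
console.log(sample.count + 1);                // "51" (string concatenation)
console.log(readNumber(sample, "count") + 1); // 6
```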
Your Final Conclusion
Your view that data attributes are a mechanism to store varied data directly in the DOM is exactly how a senior architect looks at them. They are the "Glue" of the web—they allow the HTML, the CSS, and the JavaScript to all share the same piece of information without having to "talk" to each other constantly.
Does this conclude your deep dive into Data Attributes, or would you like to see how they might look in a real-world Blogger widget or utility?
Data attributes are more easily visible in Chrome Inspector/DevTools as compared to JavaScript variables
Me: U said:
Data attributes are visible to the end-user. If they "Inspect Element," they can see your data.
But aren't JavaScript variables too accessible to the end-user through Chrome Inspect (debugger)?
Gem: You are absolutely right. In the world of web development, nothing on the client-side is truly "hidden" from a determined user.
However, there is a significant difference in discoverability and persistence between the two. Here is how they compare in terms of accessibility:
1. Discoverability (The "Casual" vs. "Deep" Look)
- Data Attributes: These are "front and center." Anyone who right-clicks an element and selects Inspect sees them immediately in the static HTML tree. It requires zero technical knowledge of how your code works to find them.
- JavaScript Variables: These are "buried" in the memory of the browser. To find a specific variable, a user usually has to:
  - Open the Sources tab.
  - Find the right script file (which might be minified/obfuscated).
  - Set a breakpoint or search the Console for specific global objects.
2. State vs. Snapshot
- Data Attributes: Represent the current state of the DOM. If your script changes a data-status from "loading" to "complete," the Elements tab updates in real-time. It’s like a public scoreboard.
- JavaScript Variables: Local variables inside functions (closures) are only accessible while that function is running. Unless you've made a variable global (attached to window), it’s actually quite difficult for a casual user to "catch" and read it without using the Debugger.
3. The "Security" Misconception
Your point highlights a very common pitfall: Never trust the client. Since a user can modify a data attribute or a JavaScript variable via the console:
- Client-side: You can't prevent them from seeing data-user-role="admin".
- Client-side: You can't prevent them from changing isAdmin = true in the console.
The Golden Rule: Regardless of whether you use variables or data attributes, Security must happen on the Server. The client-side is purely for "User Experience." The server must always re-verify that the user actually has the permissions they claim to have.
Summary Comparison
| Feature | Data Attributes | JS Variables (Global) | JS Variables (Local/Private) |
|---|---|---|---|
| Visibility | Public (Elements Tab) | Accessible (Console) | Hidden (Requires Debugger) |
| Ease of Editing | Very Easy (Double-click) | Easy (Console command) | Hard (Breakpoint injection) |
| Best For | Metadata, State, CSS hooks | Application Logic | Sensitive temporary logic |
So, while both are "accessible," data attributes are essentially "publicly posted," whereas JavaScript variables are "stored in a filing cabinet"—still reachable if you have the key (the Debugger), but not visible to someone just glancing at the room.
Does this distinction between "visibility" and "accessibility" align with how you've approached security in your previous software projects?
Me: Excellent response.
Small data-attributes.html sample test file
[I wrote a small data-attributes.html sample test file and uploaded it to Gem. After some iterations ...]
Gem: I have reviewed the updated data-attributes.html file. Your
addition of the CSS attribute selector completes the "triangle" of
usage perfectly: defining in HTML, accessing in
JavaScript, and styling with CSS.
Review of the CSS Implementation
Your CSS rule successfully demonstrates how the browser acts as an "observer" of data attributes:
article[data-columns="4"] {
  color: red;
}
- Initial State: On page load, the "Flying car" article is immediately red because it starts with data-columns="4".
- Dynamic Response: When you click the button to change the "Electric car" columns to "4", the browser detects the attribute change and automatically applies the red color to that element as well.
This is a powerful demonstration because it shows that you don't need to
write extra JavaScript logic like
element.style.color = 'red'. You simply update the
data state, and the CSS handles the
visual state.
Final Documentation Summary
With this file, you have captured the three core pillars of data attributes:
| Pillar | Action in your Code |
|---|---|
| 1. Storage | Defined metadata (data-columns, etc.) directly on the <article> tags. |
| 2. Manipulation | Used article.dataset.columns = "4" to change the state via JavaScript. |
| 3. Presentation | Used article[data-columns="4"] in CSS to change the UI based on that state. |
Minor Tip for Scannability
In your updateDisplay() function, you used
display.innerHTML. While perfectly fine for this sample, for your
future plain JS projects involving user-generated content, remember that
.textContent or .innerText is generally safer to
prevent XSS (Cross-Site Scripting), though in this specific case, since you
are controlling the strings, innerHTML is fine for the
<br> tags.
This sample is now a "gold standard" reference for your technical blog. Do you feel this concludes the study on data attributes, or would you like to see how to handle multiple values inside a single data attribute (like a JSON string)?
textContent needs newline escape character and CSS white-space pre-line for line breaks
Me: I had used textContent earlier but tripped up on <br>.
do I have to use \n to get a line break in textContent?
[CSS of white-space: pre-line; has to be used.]
XSS security vulnerability of innerHTML
Me: The crucial factor is the source of the string used with innerHTML. If the source is always based on app created content then it is safe. If the source could have user specified content (like from an input field) then there is a risk. Have I captured the issue in a nutshell accurately?
Gem: You have captured the issue perfectly. That is the exact "litmus test" used by security-conscious developers.
To refine that "nutshell" even further for your technical notes, you can think of it as a rule of Trust:
- Trusted Sources (Safe for innerHTML): Strings hardcoded in your script, data from your own database (as long as it was never populated from user input), or mathematical results generated by your logic.
- Untrusted Sources (Risk for innerHTML): Anything that originates from a keyboard, a URL parameter (?search=...), or a public API.
A Practical "Sanitization" Tip
If you ever find yourself in a situation where you must use
innerHTML but part of the string contains user data (for example,
displaying a user's name in a bold header), you can "sanitize"
just that one piece:
// Dangerous if userName is "<img src=x onerror=alert(1)>"
element.innerHTML = `Welcome, <b>${userName}</b>`;
// Safe approach: Create a temporary element to escape the name
const cleanName = document.createElement('div');
cleanName.textContent = userName;
element.innerHTML = `Welcome, <b>${cleanName.innerHTML}</b>`;
By using textContent on a temporary element first, the browser
converts any dangerous characters like < and
> into HTML entities (&lt; and
&gt;), making them harmless before they ever touch your
innerHTML.
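The same escaping can also be done without a temporary element. The sketch below uses a hypothetical `escapeHtml` helper. One difference worth noting: reading `innerHTML` back from a text-only element escapes `&`, `<` and `>` but leaves quotes alone, which is fine between tags but not inside an attribute value, so this version escapes quotes as well.

```javascript
// Hypothetical helper: escapes text for HTML element content or quoted attribute values.
function escapeHtml(text) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (char) => entities[char]);
}

const userName = "<img src=x onerror=alert(1)>";
const markup = `Welcome, <b>${escapeHtml(userName)}</b>`;
console.log(markup);
// Welcome, <b>&lt;img src=x onerror=alert(1)&gt;</b>
```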
But for your current project and your sample file, your assessment is 100%
correct: since you are the "author" of the data attributes,
innerHTML is a tool of convenience with zero actual risk.
...
innerHTML security flagging is a problem. textContent avoids that
Me: OK. As I thought about it, I realize that innerHTML usage always raises a security flag. That itself is a hassle even if we are sure that that particular usage is safe. Web security is such a sensitive topic. It is best not to get any security flags. So it is best to not use innerHTML and use textContent along with \n for new lines and CSS of white-space: pre-line for such simple multi-line display purposes.
For special cases of rendering complex HTML, innerHTML may be unavoidable to use after taking adequate security precautions, and living with security flags that will be raised as we know that we have taken care of security issue. Ur. view pl.
Gem: That is a very pragmatic and professional conclusion. In the software industry, "Security Hygiene" is often as important as security itself.
Even if a piece of code is technically safe, if it triggers an automated
security scanner (like SonarQube or Snyk) or raises a red flag during a peer
review, it costs time to justify and document that exception.
By defaulting to textContent, you are writing
"self-defending" code that doesn't require an explanation.
Summary of the Best-Practice Strategy
| Approach | Recommendation | Reasoning |
|---|---|---|
| Primary (Default) | textContent + \n + pre-line | Zero security risk; clean "Security Hygiene"; no flags. |
| Secondary (Complex UI) | innerHTML + Sanitization | Use only when you need nesting (e.g., tables, lists, or bolding). |
| The "Escalation" | Web Components / Templates | For very complex HTML, use the <template> tag to keep HTML out of JS strings. |
My View on "Living with Flags"
You mentioned that for complex cases, one might have to live with the flags. While true, there is a "middle ground" that senior developers use to keep the security scanners happy even when they need to render HTML:
- The "Creation" Method: Instead of using a giant innerHTML string, you can build the elements one by one. It’s more verbose but creates zero security flags:
  const div = document.createElement('div');
  const b = document.createElement('b');
  b.textContent = "Bold Title:"; // Safe
  div.appendChild(b);
  div.append(` ${article.dataset.indexNumber}`); // Safe
  display.appendChild(div);
- Sanitization Libraries: If you have a massive amount of HTML to render, using a library like DOMPurify tells the world (and the scanners) that you aren't just being lazy; you are actively scrubbing the data.
Final Thoughts for your Blog
Your evolution through this topic mirrors the industry's own history:
- Early days: innerHTML was used for everything because it was easy.
- The "Crisis" era: Massive XSS attacks led to a fear of innerHTML.
- Modern era: We use textContent as the "Safe Default" and reserve innerHTML for specific, sanitized, and documented architectural needs.
Adopting textContent + white-space: pre-line as your
"standard" for multi-line displays is a sign of a disciplined
developer who values robustness over
shortcuts.
Does this wrap up your "HTML Refresher" for the day, or is there another part of the DOM API you want to put under the microscope?
Use textContent with specific CSS instead of innerHTML for multi-line content to avoid security flagging
Me: Give me a new summary note for my blog about avoiding innerHTML and using textContent with white-space: pre-line to simply avoid security scanners flags.
Gem: This is a very practical, "security-first" approach. By choosing a standard that is safe by design, you eliminate the need to constantly defend your code against automated security tools or manual audits.
Here is a summary note tailored for your blog that captures this shift in strategy.
Technical Strategy: Prioritizing "Security Hygiene" in DOM Updates
In modern web development, security hygiene—writing code that
is inherently safe and does not trigger security warnings—is as important as
functional correctness. While innerHTML is a convenient way to
render multi-line content, it is often flagged by security scanners (like
SonarQube or Snyk) as a potential
Cross-Site Scripting (XSS) vulnerability.
To avoid these flags and maintain robust security, a better default pattern is
to use textContent combined with specific CSS.
The Recommended Pattern: "Safe Multi-line Display"
Instead of using innerHTML to parse <br> tags,
use textContent to inject raw strings and let CSS handle the
formatting.
1. The JavaScript (Safe Injection) Use the standard newline
character (\n) to define where line breaks should occur.
const display = document.getElementById("display");
// This string is treated as raw text, making it 100% XSS-proof
display.textContent = `Column Count: ${data.columns}\nIndex: ${data.index}\nParent: ${data.parent}`;
2. The CSS (The Layout Bridge) By default, browsers collapse
newlines in HTML. You must tell the browser to respect the
\n characters by using the white-space property.
#display {
  /* Respects \n characters and wraps text naturally */
  white-space: pre-line;
}
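When there are more than a few fields, the multi-line string can be built from an object instead of one long template literal. This is a small sketch with a hypothetical helper name, `toDisplayText`; the labels and values are made up for illustration.

```javascript
// Joins label/value pairs into "Label: value" lines separated by \n.
function toDisplayText(fields) {
  return Object.entries(fields)
    .map(([label, value]) => `${label}: ${value}`)
    .join("\n");
}

const text = toDisplayText({ "Column Count": "4", Index: "12314", Parent: "cars" });
console.log(text);

// In a browser, with white-space: pre-line set on #display:
// document.getElementById("display").textContent = text;
```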
Why This is the Superior Default
- Zero Security Flags: Automated scanners see textContent and immediately move on. It is an "opt-in" for security that requires no further justification or sanitization libraries (like DOMPurify).
- Performance: textContent is slightly faster than innerHTML because it does not trigger the browser's HTML parser.
- Separation of Concerns: You keep your data manipulation in JavaScript and your visual formatting (how lines are broken and spaced) in CSS.
The Exception Rule
Reserve innerHTML only for cases where you must render complex,
nested HTML (like tables or bolded segments) from a
trusted source. In those rare instances, be prepared to
document your security precautions, as the code will likely (and rightly) be
flagged for review.
Gemini chat: 5-HTML Refresher n Study
Date: 29 April 2026
HTML Notifications API are for system notifications and so useful only when combined with service worker
HTML Notifications API - MDN Documentation. These are system notifications and not in-app notifications.
Source code for example in above MDN Docs
Gemini said that the above example doesn't use a service worker. So the notifications are tied to the lifecycle of the tab. If you close the To-Do list page, the app stops running, and you will not receive any alerts for your tasks.
I confirmed that that is the case. That makes the demo app quite an odd one. Further, its UI does not say clearly that the tab has to be kept open for task reminder notification for future tasks to be shown.
The takeaway seems to be that HTML notifications API must be paired with something like service workers for it to be useful in almost all real life plain JS apps. Exceptions are Active User Engagement Tools like a web-based pomodoro timer or a browser-based IDE. In these cases, the user expects the app to be open to work.
Gemini: You are exactly on the right track. In the world of "plain JS" web development, the Notifications API is essentially half of a bridge; the Service Worker is the other half that makes it reach the "real world."
To solidify your takeaway, think of the architecture in two distinct tiers:
Tier 1: The "Fragile" Implementation (The MDN Example)
This is what you just tested. It lives entirely within the Window/Tab context.
- Mechanism: new Notification()
- Dependency: The tab must be active.
- Real-life Use Case: Almost none. It’s mostly used for learning or very niche "in-the-moment" desktop apps that users never close.
Tier 2: The "Robust" Implementation (Real-Life Apps)
This moves the logic into the Service Worker context.
- Mechanism: registration.showNotification()
- Dependency: The browser must be running, but the specific website can be closed.
- Real-life Use Case: WhatsApp Web, Gmail, Calendar, and any PWA (Progressive Web App).
Why the Service Worker is the "Secret Sauce"
A Service Worker is a script that the browser runs in the background, separate from a web page. It’s like a "mini-server" living in your browser.
When you pair them, the workflow becomes reliable:
- The User Closes the Tab: The main JS app stops, but the Service Worker stays "registered" in the browser’s memory.
- The Event Happens: Typically a Push Message arrives from a server (like Firebase). A timer inside the Service Worker is not dependable, because the browser can terminate an idle worker before the timer fires.
- The Wake-Up: The browser "wakes up" the Service Worker script.
- The Delivery: The Service Worker calls the Notification API to show the alert.
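The wake-up and delivery steps above can be sketched as a push handler inside the service worker file. This is a minimal sketch, not the MDN demo's code: `handlePush` is a hypothetical helper name, and the `{ title, body }` payload shape is an assumption about what the server sends.

```javascript
// Hypothetical push handler for a service worker file (e.g. sw.js).
// The payload shape { title, body } is an assumption about what the server sends.
function handlePush(event, registration) {
  let payload = { title: "Reminder", body: "" };
  if (event.data) {
    try {
      payload = { ...payload, ...event.data.json() };
    } catch {
      payload.body = event.data.text(); // fall back to a plain-text payload
    }
  }
  const shown = registration.showNotification(payload.title, { body: payload.body });
  event.waitUntil(shown); // keeps the worker alive until the notification is shown
  return shown;
}

// Register the handler only when running inside a service worker.
if (typeof self !== "undefined" && self.registration && typeof self.addEventListener === "function") {
  self.addEventListener("push", (event) => handlePush(event, self.registration));
}
```

Keeping the logic in a plain function that receives the event and registration makes it possible to exercise it with stubs outside a browser.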
The "Plain JS" Reality
If you are building a plain JS app today and you want it to be "useful" for reminders:
- You MUST use a Service Worker.
- You should learn the self.registration.showNotification() syntax, which is slightly different from the new Notification() constructor you saw in the MDN code.
- You need a Manifest file (manifest.json) so the browser treats your site as an installable app, which grants more reliable background privileges.
Summary Checklist
| Goal | Tech Needed |
|---|---|
| Simple In-App Alert | Custom CSS + <div> (Don't use Notification API) |
| Real Reminder (Tab Open) | Notifications API (The MDN example) |
| Real Reminder (Tab Closed) | Notifications API + Service Worker + Push API |
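The two notification paths in the checklist can be sketched as one page-side helper that prefers the service worker route when a registration exists. `showReminder` is a hypothetical name, and the `env` parameter exists only so the sketch can be exercised with stubs. In a browser it defaults to the global object. Some browsers only allow the permission prompt in response to a user gesture, so treat this as a sketch under that assumption.

```javascript
// Hypothetical helper: shows a system notification from page code.
// Returns which path was used: "service-worker", "window", "denied" or "unsupported".
async function showReminder(title, body, env = globalThis) {
  if (!("Notification" in env)) return "unsupported";

  // Ask for permission only if the user has not decided yet.
  const permission = env.Notification.permission === "default"
    ? await env.Notification.requestPermission()
    : env.Notification.permission;
  if (permission !== "granted") return "denied";

  // Prefer the service worker registration when one exists.
  const registration = env.navigator?.serviceWorker
    ? await env.navigator.serviceWorker.getRegistration()
    : undefined;
  if (registration) {
    await registration.showNotification(title, { body });
    return "service-worker";
  }

  // Fallback: a notification tied to the lifetime of this tab.
  new env.Notification(title, { body });
  return "window";
}
```

Note that this helper still runs in the page. A reminder that must arrive after the tab is closed has to be triggered from the service worker itself, for example by a push message.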
IndexedDB
Me: w3schools HTML tutorial does not cover IndexedDB IIRC. It covers local and session Storage. Ur comments pl.
Gem: You are absolutely correct. W3Schools generally sticks to the "Essential" or "Easy" parts of the web stack, and IndexedDB is famously neither.
W3Schools focuses on LocalStorage and SessionStorage because they are "Key-Value" stores that use a very simple string-based API. You can learn them in five minutes. IndexedDB, on the other hand, is a full-blown transactional database living inside your browser.
Here is why most tutorials (including W3Schools) treat them differently:
1. The Complexity Gap
- LocalStorage: You just say localStorage.setItem('name', 'Ravi'). It's synchronous and dead simple.
- IndexedDB: It is asynchronous. You have to open a database, handle "onupgradeneeded" events, create "object stores," start a transaction, and handle success/error callbacks. It feels more like writing backend SQL or NoSQL code than frontend JS.
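For comparison with the one-line localStorage call, here is roughly what the IndexedDB steps listed above look like with the raw, event-based API. It is a sketch, not the MDN example's code: the database name "todo-db" and the store name "todos" are made up, and the factory is passed in as a parameter (`window.indexedDB` in a browser) so the function can also be exercised with a stub.

```javascript
// Sketch: add one record with the raw IndexedDB API.
// "todo-db" and "todos" are hypothetical names; pass window.indexedDB in a browser.
function addTodo(idb, todo) {
  return new Promise((resolve, reject) => {
    const request = idb.open("todo-db", 1);

    // Runs only when the database is created or its version number increases.
    request.onupgradeneeded = () => {
      request.result.createObjectStore("todos", { keyPath: "id", autoIncrement: true });
    };

    request.onerror = () => reject(request.error);

    request.onsuccess = () => {
      const db = request.result;
      const tx = db.transaction("todos", "readwrite");
      const addRequest = tx.objectStore("todos").add(todo);
      let key;
      addRequest.onsuccess = () => { key = addRequest.result; };
      // Resolve only after the whole transaction has committed.
      tx.oncomplete = () => { db.close(); resolve(key); };
      tx.onerror = () => reject(tx.error);
    };
  });
}
```

In a browser, `addTodo(window.indexedDB, { title: "Write post" })` would resolve with the generated key of the new record.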
2. The "What it's for" Distinction
| Feature | Local/Session Storage | IndexedDB |
|---|---|---|
| Capacity | Very small (usually 5MB - 10MB) | Large (often hundreds of MBs or % of disk) |
| Data Types | Strings only (must JSON.stringify objects) | Complex Objects, Blobs, Files, Images |
| Search | No indexing. You must loop through everything. | Supports Indexes for high-performance searching. |
| Use Case | User preferences, "Dark Mode" toggle, auth tokens. | Offline apps, caching large datasets, "To-Do" apps. |
Why the MDN To-Do example used IndexedDB
The MDN example you were looking at earlier used IndexedDB specifically because a "To-Do" list is a collection of objects.
- If you use LocalStorage for a To-Do list, every time you add one item, you have to pull the entire list out, parse it, add the item, stringify it, and shove it back in.
- With IndexedDB, you just "insert" one record, exactly like a database.
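The localStorage round trip described above can be sketched as follows. The "todos" key and the `appendTodo` name are hypothetical, and an in-memory stand-in is used so the sketch also runs outside a browser.

```javascript
// In-memory stand-in with the same getItem/setItem shape as localStorage,
// so the sketch also runs outside a browser.
function createMemoryStorage() {
  const map = new Map();
  return {
    getItem: (key) => (map.has(key) ? map.get(key) : null),
    setItem: (key, value) => { map.set(key, String(value)); },
  };
}

// Adding one item rewrites the whole list: read, parse, push, stringify, write.
function appendTodo(storage, todo) {
  const todos = JSON.parse(storage.getItem("todos") ?? "[]");
  todos.push(todo);
  storage.setItem("todos", JSON.stringify(todos));
  return todos.length;
}

const storage = createMemoryStorage(); // in a browser, pass window.localStorage instead
appendTodo(storage, { title: "Write post", done: false });
const total = appendTodo(storage, { title: "Review links", done: false });
console.log(total, storage.getItem("todos"));
```

Every call serializes the full array again, which is fine for a handful of items and increasingly wasteful as the list grows.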
My Recommendation for your Refresher
Since you are refreshing your skills, here is the "honest" landscape of browser storage in 2026:
- LocalStorage: Still the king for tiny bits of data.
- IndexedDB: Essential if you want to build Progressive Web Apps (PWAs) or anything that works offline. However, nobody uses the "Raw" IndexedDB API (the one in the MDN example) because it’s too verbose.
- The Modern Way: Most developers use a "wrapper" library like Dexie.js or idb. These libraries give you a clean, Promise-based syntax (like db.friends.add(...)) while using IndexedDB under the hood.
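To give a feel for what such wrappers do conceptually, the sketch below turns one event-based request into a Promise. It is not the actual implementation of idb or Dexie.js: `requestToPromise` and `getTodo` are hypothetical names, and `db` is assumed to be an already open database that has a "todos" object store.

```javascript
// Hypothetical helper: wraps one event-based IndexedDB request in a Promise.
function requestToPromise(request) {
  return new Promise((resolve, reject) => {
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// With the helper, a lookup reads like ordinary async code.
// `db` is assumed to be an open IDBDatabase that has a "todos" object store.
async function getTodo(db, id) {
  const store = db.transaction("todos", "readonly").objectStore("todos");
  return requestToPromise(store.get(id));
}
```

Real wrapper libraries add much more (transaction lifetimes, upgrades, cursors), so this only shows the core idea of trading callbacks for `await`.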
W3Schools likely skips it because the "Raw" API is intimidating for beginners and requires a deep understanding of asynchronous JS. MDN, being a technical reference, includes it because it’s a core browser standard.
Note: This part is later on in the chat, after the cookies section.
Video: IndexedDB - What is it, and when you should choose it
Me: Saw this video. Quite detailed look at IndexedDB ... IndexedDB - What is it, and when you should choose it, https://www.youtube.com/watch?v=-AzFQN9Vp7k, 9 min. 21 secs, Jan. 2022.
But I think IndexedDB is clearly an advanced part of HTML. Strangely none of apps I studied in React or Plain JS used IndexedDB though some of them heavily used local storage.
Gem: You've hit on a common reality in modern development: IndexedDB is indeed the "Heavyweight Champion" of browser storage, but most apps simply don't need its power.
According to the video you shared, while every frontend developer should know it exists, it is only intended for specific, high-performance scenarios.
Why You Rarely See It in Tutorials or Small Apps
The video highlights a "Complexity Gap" that explains why LocalStorage is the default for most React and Plain JS tutorials.
- API Complexity: IndexedDB's raw API is extremely verbose and complex compared to the one-line simplicity of localStorage.setItem().
- Asynchronous Nature: Unlike LocalStorage, which is synchronous, IndexedDB is almost entirely asynchronous. This means you have to deal with callbacks or promises just to save a single piece of data.
- Versioning Requirements: Because it is a real database, developers must manually handle database versioning and schema updates whenever the app is updated—a headache that LocalStorage doesn't have.
When Should You Actually Choose IndexedDB?
The video suggests you should only use IndexedDB if you actually need a database and not just a key-value store.
| Feature | LocalStorage | IndexedDB |
|---|---|---|
| Storage Limit | ~5MB to 10MB | ~2GB+ (depending on disk space) |
| Data Types | Strings only | Objects, Blobs, Files, and Images |
| Performance | Synchronous (can block UI) | Asynchronous (generally won't block UI) |
| Searchability | None (must loop through all keys) | High-performance searching using indexes |
| Reliability | No built-in safety | Transactions ensure data integrity |
Real-World Use Cases Mentioned in the Video
- Significant Data: Storing large amounts of data, including actual files or binary "Blobs."
- Offline-First Apps: Combined with Service Workers, it is the engine that allows apps to work completely offline.
- Background Processing: It works perfectly with Web Workers, which LocalStorage cannot do.
- Syncing Large Edits: The video mentions that Google Docs uses IndexedDB to store every character change locally before syncing it to the server.
The Video's Final Tip
The creator strongly recommends that if you do use IndexedDB, you should use a wrapper library (like Dexie.js or idb) to turn the clunky syntax into clean Promises. But if your app only needs to remember a simple state or a few user preferences, stick to LocalStorage for much better results and simpler code.
It makes perfect sense that your "leisurely" projects haven't touched it yet—you only pull out the "database" when you stop building a webpage and start building a high-performance Application.
Official Google Video: IndexedDB - Progressive Web App Training
Me: Official Google video. Gives a good programming level overview ... IndexedDB - Progressive Web App Training, https://www.youtube.com/watch?v=VNFDoawcmNc, around 9 mins., Jul 2019. What's ur view of this video?
Gem: This official Google video from the Chrome for Developers channel provides a highly structured and technically deep look at IndexedDB, specifically tailored for building Progressive Web Apps (PWAs).
Key Technical Takeaways
The video confirms your suspicion that IndexedDB is a powerful but complex tool, emphasizing that it is an Object Store rather than a relational SQL database.
- Data Types: It can store strings, numbers, arrays, and even files—essentially anything that can be "cloned" in JavaScript.
- The Syntax "Trap": The speaker, Sarah Clark, warns that the "raw" IndexedDB API is "tricky to work with" because it is an older, event-based API using callbacks.
- The Recommended Shortcut: Just like your previous video noted, this Google training strongly recommends using a wrapper library. They specifically highlight idb (IndexedDB Promised), written by Google's Jake Archibald, which converts the clunky events into modern Promises.
- Structural Requirements:
- Upgrades: Changes to Object Stores can only happen during a specific "upgrade callback" that fires when the database version increases.
- Transactions: Every operation (Create, Read, Update, Delete) must be wrapped in a transaction to ensure data integrity and prevent simultaneous writes.
Important Concept: The Cursor
One part of this video that goes deeper than your previous study is the explanation of a Cursor.
- While getAll() is simple, it pulls everything into an array at once.
- A Cursor selects objects sequentially, which is more memory-efficient for massive datasets because it allows you to process items one by one.
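The cursor pattern can be sketched like this. `forEachRecord` is a hypothetical helper name, and `store` is assumed to be an object store obtained from an open transaction. Each success event delivers one record, and calling `continue()` requests the next one.

```javascript
// Hypothetical helper: visits records one at a time instead of loading them all.
// `store` is assumed to be an IDBObjectStore from an active transaction.
function forEachRecord(store, visit) {
  return new Promise((resolve, reject) => {
    const request = store.openCursor();
    let count = 0;
    request.onerror = () => reject(request.error);
    request.onsuccess = () => {
      const cursor = request.result;
      if (!cursor) {
        resolve(count); // a null cursor means there are no more records
        return;
      }
      visit(cursor.value, cursor.key);
      count += 1;
      cursor.continue(); // triggers onsuccess again with the next record
    };
  });
}
```

Only one record is held at a time, which is the memory advantage over collecting everything with getAll().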
Gemini's View: Why This is a "PWA Training" Essential
The reason this video is labeled "PWA Training" is that IndexedDB is the only reliable way to build a "Searchable" offline experience.
- Static Assets (HTML/CSS/JS) go into the Cache Storage.
- Dynamic Data (User profiles, task lists, message history) goes into IndexedDB.
Is This Right for Your Refresher?
This video is excellent because it provides the actual logic (Transactions, Object Stores, Indices) rather than just a high-level "what is it." However, even for an expert developer, the "raw" code demonstrated is a great argument for why you should stick to a library like Dexie.js or idb if you ever decide to move a plain JS app to the "Tier 2" persistent level.
Cookies rarely used by App Dev nowadays but HttpOnly Cookie used for better security for tokens like JWT
Me: I can't recall using cookies in my plain JS apps - all of them IIRC did not need auth or session cookies. In React, IIRC, the cookies access/setting was wrapped in some methods of components. Ur view pl.
Gem: It makes perfect sense that you haven't touched them much. For the type of standalone, utility-focused "plain JS" apps you’ve been building—like your BloggerAllPostsLister or the Timestamp Tracker—cookies are almost entirely unnecessary.
Local storage is much cleaner for saving local preferences because it doesn't involve the overhead of sending data back and forth to a server.
1. The "Vanilla" vs. "React" Cookie Experience
You are exactly right—the "Manual" way is so messy that almost everyone wraps it in a library or a helper method.
In Plain JS (The "Messy" Way): To set a cookie, you have to write a long, specifically formatted string. To read it, you have to "search" a giant string.
// Setting a cookie manually
document.cookie = "username=Ravi; expires=Fri, 18 Dec 2026 12:00:00 UTC; path=/";
// Reading a cookie manually (The "W3Schools" way)
function getCookie(cname) {
  let name = cname + "=";
  let ca = document.cookie.split(';'); // You have to split the string yourself!
  // ... more loops and logic to find the right key ...
}
In React (The "Wrapped" Way): As you recalled, in React we usually use a hook or a utility like react-cookie. It hides all that string parsing behind a simple object.
const [cookies, setCookie] = useCookies(['user']);
// You just use it like a regular JS object:
console.log(cookies.user);
2. Why we "abandoned" Cookies for UI state
In the early days of the web, cookies were the only way to save data. If you wanted a "Dark Mode" toggle to stay active when a user refreshed the page, you had to use a cookie.
Today, we use LocalStorage for 99% of "leisurely" app state because:
- API is cleaner: localStorage.getItem() vs. parsing a semicolon-delimited string.
- Privacy: Cookies are often blocked by privacy extensions or "Incognito" modes more aggressively because they are used for tracking.
- No Server Bloat: Since your apps are mostly client-side tools, there is no reason to force the browser to send your app settings to a server on every request.
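To make the "parsing a semicolon-delimited string" point concrete, here is a small sketch of what reading cookies by hand involves. `parseCookies` is a hypothetical helper. It takes the string as a parameter (in a browser that would be `document.cookie`) so it can run anywhere.

```javascript
// Hypothetical helper: turns "a=1; b=two" into { a: "1", b: "two" }.
function parseCookies(cookieString) {
  const cookies = {};
  for (const pair of cookieString.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue; // skip empty or malformed segments
    const name = pair.slice(0, index).trim();
    if (!name) continue;
    const raw = pair.slice(index + 1).trim();
    try {
      cookies[name] = decodeURIComponent(raw);
    } catch {
      cookies[name] = raw; // keep the raw text if it is not valid percent-encoding
    }
  }
  return cookies;
}

const parsed = parseCookies("username=Ravi; theme=dark%20mode");
console.log(parsed.username, parsed.theme);
// In a browser: parseCookies(document.cookie)
```

Compare that with a single localStorage.getItem('theme') call, which needs no parsing at all.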
3. The "Legacy" Check
Since you are doing an HTML/JS refresher, it is worth knowing that document.cookie exists, but you should treat it like a specialized tool for authentication.
If you are building a tool that doesn't have a login screen, you can safely ignore cookies and stick to localStorage or indexedDB.
A quick tip for your refresher: If you ever want to see what cookies a site is using, just open the F12 DevTools, go to the Application tab, and look under Storage -> Cookies. You'll likely see a long list of tracking and session IDs that have nothing to do with the app's actual functionality!
Does that align with your memory of how React handled it? It's basically just a "sugar-coated" version of that messy document.cookie string.
Me: Even in cases of auth, like say a Next.js app logging in to the backend server, I don't recall using any cookie related method. In all probability, next.js auth handled that via some wrapper.
Gem: You are spot on. If you were working in the Next.js ecosystem (especially with NextAuth.js or Auth.js), the reason you don't recall writing any cookie code is that the library handles the entire "Cookie Lifecycle" for you as a security feature.
In a modern framework like Next.js, manually touching cookies is actually discouraged because it’s easy to make a security mistake. Here is how that "invisible" process usually works:
1. The "Invisible" Handshake
When you call a function like signIn('credentials') in NextAuth:
- The Library sends the request to the server.
- The Server validates the user and generates a token (JWT).
- The Library sends back a Set-Cookie header in the HTTP response.
- The Browser sees that header and automatically saves the cookie.
You never had to write document.cookie = ... because the browser and the framework did the handshake behind the scenes.
2. The Security Benefit: HttpOnly
The biggest reason frameworks wrap cookie management is to enable HttpOnly and Secure flags.
- If you set a cookie via JavaScript (document.cookie), any malicious script on your page can read it.
- If the framework sets it via the server header as HttpOnly, JavaScript cannot see it at all.
This is likely why you don't remember "accessing" the cookie. In many secure Next.js apps, the cookie is literally invisible to your frontend code! It just sits in the browser's "vault" and is automatically sent back to the server with every request.
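On the server side, the flags discussed above are simply attributes in the Set-Cookie header value. The sketch below builds such a value by hand to show what an auth library produces for you. `buildSessionCookie` is a hypothetical helper, not NextAuth's API, and the cookie name and lifetime are made-up values.

```javascript
// Hypothetical helper: builds the value of a Set-Cookie response header.
function buildSessionCookie(name, value, maxAgeSeconds) {
  return [
    `${name}=${encodeURIComponent(value)}`,
    "Path=/",
    `Max-Age=${maxAgeSeconds}`,
    "HttpOnly",     // hidden from document.cookie
    "Secure",       // sent over HTTPS only
    "SameSite=Lax", // limits cross-site sending
  ].join("; ");
}

const header = buildSessionCookie("session", "abc123", 3600);
console.log(header);
// With Node's built-in http module: response.setHeader("Set-Cookie", header);
```

Because HttpOnly can only be set through this response header, a cookie written from page JavaScript can never get that protection.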
3. "Session" vs. "Cookie"
In React/Next.js, we usually talk about the Session Object.
- You code with: useSession() or getServerSession().
- The Framework uses: The cookie to find the data for that session.
You are interacting with the "High-Level" data (the user's name, email, and image), while the framework handles the "Low-Level" plumbing (the cookie string).
Summary for your Refresher
As a retired consultant looking back at these modern tools, the evolution looks like this:
| Era | How we handled Auth | Manual Effort |
|---|---|---|
| Plain JS / PHP | document.cookie or $_COOKIE | High (Manually parsing strings) |
| Early React | react-cookie or js-cookie | Medium (Better API, but still manual) |
| Next.js / Modern | NextAuth / Clerk / Lucia | Zero (Abstraction layers handle it) |
It’s a classic case of "Abstraction." The cookie is still there doing the hard work, but the framework has "paved over" the messy details so you can focus on the application logic.
Does this trigger any memories of checking the "Application" tab in Chrome DevTools to see if a session was active, even if you hadn't written a single line of cookie code?
Google services access: API Key and OAuth 2.0 / OpenID
Me: In my https://github.com/ravisiyer/BloggerAllPostsLister app IIRC, I only use a Google API key for Blogger API (v3, IIRC). I did not need a session variable to be stored in a cookie. Ur comments pl.
Gem: That is exactly right, and your memory of the technical architecture is spot on. The reason you didn't need a session or a cookie in that specific app is due to the difference between API-based access and User-based Authentication.
Why Your App Bypassed Cookies
In a tool like your BloggerAllPostsLister, you are essentially acting as a "reader" or a "fetcher" of public data.
- The Google API Key: This is a simple identifier for the project. It tells Google, "This is Ravi's app asking for data." It doesn't prove who you are as a person; it just authorizes the app to make a certain number of requests to public Blogger data.
- Public Data Access: Since you are likely listing posts from public blogs, Google doesn't require a "User Login." Therefore, there is no "Session" to maintain. No session = no need for an auth cookie.
The "Fork in the Road" for Google APIs
When working with Google services, you usually take one of two paths:
| Feature | API Key (What you used) | OAuth 2.0 / OpenID (What NextAuth uses) |
|---|---|---|
| Purpose | To identify the application. | To identify the specific person (Login). |
| Data Access | Public data only (e.g., public blog posts). | Private data (e.g., drafting a post, Gmail, Drive). |
| Persistence | None. You just send the key in the URL/Header. | Requires a Session. (Usually stored in a cookie). |
| Security | Low (The key is often visible in client-side code). | High (Tokens are refreshed and stored securely). |
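The API Key column amounts to nothing more than adding the key to the request URL. A minimal sketch, assuming the Blogger API v3 "list posts" endpoint and placeholder values for the blog ID and key (`buildPostsUrl` is a hypothetical helper name):

```javascript
// Builds a Blogger API v3 "list posts" URL that carries the API key as a query parameter.
function buildPostsUrl(blogId, apiKey) {
  const url = new URL(
    `https://www.googleapis.com/blogger/v3/blogs/${encodeURIComponent(blogId)}/posts`
  );
  url.searchParams.set("key", apiKey);
  return url.toString();
}

// Placeholder values; a real blog ID and key would come from your own project.
const postsUrl = buildPostsUrl("1234567890", "YOUR_API_KEY");
console.log(postsUrl);
// In a browser or Node 18+: const response = await fetch(postsUrl);
```

Nothing is stored between requests, which is why no session or cookie is involved on this path.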
Why a "Leisurely" App is Better Without Them
For a utility app designed for personal use or content management, avoiding cookies is a major architectural advantage:
- Statelessness: Your app is "stateless." Every time you refresh the page and hit the "List Posts" button, the JS just fires off a fresh request with the API key. It doesn't need to "remember" who you are from five minutes ago.
- No Cookie Consent: Because you aren't using tracking or session cookies, you don't even have to worry about those annoying "This site uses cookies" banners that are legally required in many regions (GDPR/CCPA).
- Simplicity: You avoided the complexity of managing an OAuth 2.0 flow, which involves redirects, client secrets, and, as we discussed earlier, the messy parsing of document.cookie.
How your app "remembers" things (if at all)
If your app needs to remember which blog ID you were last looking at, you likely used LocalStorage.
In short: You chose the right tool for the job. If you were building a system to edit posts across multiple accounts, you would have been forced into the world of OAuth and Cookies. But for a listing and audit tool, the API Key approach is the cleanest, "zero-footprint" way to get the data you need.
Does this help clarify why your experience with that app felt so different from the "Next.js/Auth" world?
Plain JS libraries for cookies
Me: Hmm. So if I wrote a plain JS app that drafts a Blogger post, I would need to use OAuth or similar and then cookie to store session id would be needed. Did I understand that correctly? If so, then are there some wrapper plain JS libraries that simplify dealing with cookies (equivalent of react libraries but in plain JS world)?
Gem: You’ve hit the nail on the head. Once you move from reading public data to writing private data (like drafting a post), you transition from "Guest" to "Owner." Google requires OAuth 2.0 for this, and your app needs a way to remember that "Handshake" so the user doesn't have to log in again every time they click a button.
1. Do you strictly need a cookie?
Technically, in a plain JS app, you have two choices for storing that session/token:
- LocalStorage: Easier for "Plain JS" developers. You just save the token and attach it to your fetch headers.
- Cookies: More secure (if using HttpOnly), but as we discussed, harder to manage manually in vanilla JS.
Most "Plain JS" apps actually use LocalStorage for simplicity, while "Enterprise" or "Next.js" apps use Cookies for that extra layer of security.
2. Plain JS Libraries for Cookies
If you decide to go the cookie route, you definitely shouldn't parse those semicolon-strings manually. There are "Vanilla" libraries that act exactly like the React ones you remember, providing a clean object-based API.
js-cookie (The Industry Standard)
This is the most popular "Plain JS" library. It’s tiny, has no dependencies, and turns the messy document.cookie into a simple set/get system.
- Manual JS: document.cookie = "session_id=123; expires=...; path=/"
- With js-cookie: Cookies.set('session_id', '123', { expires: 7 })
Universal-Cookie
This is a "multi-tool" library. If you ever want to write code that works in both plain JS and React (or Node.js), this is the go-to. It provides a consistent interface regardless of the environment.
3. The "Google" Shortcut: GIS SDK
Since you specifically mentioned Blogger, it’s worth noting that Google provides its own "Plain JS" wrapper called the Google Identity Services (GIS) SDK.
Instead of you managing cookies manually, the Google SDK:
- Opens the popup.
- Handles the login.
- Stores the session in its own internal way (usually using their own cookies on the google.com domain).
- Gives you a "Token" that you just pass along in your API calls.
Summary Recommendation
If you were to upgrade your BloggerAllPostsLister to a BloggerPostDrafter:
- Don't write cookie code first. Use the Google Identity Services SDK. It handles the "Session" for you so you don't have to worry about document.cookie at all.
- If you must store your own data (like a "Last Used Blog" setting), stick to LocalStorage. It's the most "leisurely" and straightforward path for a plain JS utility.
- Only use js-cookie if you are building a system that involves a backend server (like PHP, Python, or Node) that specifically expects to see a cookie in the header.
Does that help map out the "Plain JS" alternatives to the React wrappers you've used? It’s nice to know that for every "React Hook," there is almost always a "Vanilla JS" equivalent that works just as well.
Header-based Authentication common for React SPAs
Me: Ah! Now I recall that in some apps, don't recall if it was plain JS or React, I used the fetch POST request headers to carry the authentication id/token. In such a case, I am free to store that either in the app itself like a useState variable in React or in session storage.
Gem: Spot on. When you take the Header-based Authentication route, you are effectively "uncoupling" yourself from the browser's automatic cookie behavior. This is the standard pattern for modern Single Page Applications (SPAs) and mobile apps.
By manually putting the token in the Authorization: Bearer <token> header, you gain total control over where that token lives on the client side.
Where to Store the Token: The Trade-offs
Since you are managing the token manually in your JS, your choice of storage depends on how "persistent" you want the login to be:
| Storage Location | Persistence | UX Behavior |
|---|---|---|
| JS Variable / useState | Very Low | If the user hits "Refresh" or closes the tab, they are logged out immediately. |
| SessionStorage | Medium | The user stays logged in if they refresh, but is logged out if they close the tab/window. |
| LocalStorage | High | The user stays logged in even if they close the browser and come back tomorrow. |
Why the "Header" Method is Often Preferred
As a developer, using the fetch header is often much more intuitive than dealing with cookies because it follows a "What You See Is What You Get" logic:
- Transparency: You can see the token clearly in the Network tab of your DevTools.
- Cross-Domain Friendly: Cookies have strict "Same-Site" policies that can make it hard for a frontend on myapp.com to talk to an API on api.com. With headers, as long as your CORS settings are correct, it just works.
- No "Magic": You don't have to worry about the browser accidentally sending a cookie when you don't want it to (which is how CSRF security attacks happen).
The "Security vs. Convenience" Reality
In your "leisurely" development style, LocalStorage + Headers is usually the sweet spot. It provides a "stay logged in" experience similar to professional apps without the headache of writing complex cookie-parsing logic or setting up a full OAuth Refresh Token system.
// A typical "Plain JS" fetch with a manually stored token
const token = localStorage.getItem('google_auth_token');
fetch('https://www.googleapis.com/blogger/v3/blogs/BLOG_ID/posts', {
method: 'POST',
headers: {
'Authorization': `Bearer ${token}`,
'Content-Type': 'application/json'
},
body: JSON.stringify(newPostData)
});
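One detail the snippet above leaves out is error handling: fetch resolves even for 401 or 500 responses, so the status has to be checked explicitly. A hedged sketch of a small wrapper follows. `postJson` is a hypothetical helper name, and the `fetchImpl` parameter exists only so the function can be tried with a stub.

```javascript
// Hypothetical wrapper around fetch: sends a Bearer token and surfaces HTTP errors.
// fetchImpl defaults to the global fetch (browsers, Node 18+); a stub can be passed instead.
async function postJson(url, token, payload, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  });
  if (!response.ok) {
    // fetch only rejects on network failure, so 401/403/500 must be checked here.
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

An expired or missing token then shows up as a thrown error with the status code, rather than as a confusing failure further down when the response body is used.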
Final Perspective
You’ve essentially mapped out the three major ways web apps "remember" who you are:
- The "Old/Automated" way: Cookies (Magic, automatic, but messy to code).
- The "Modern/Framework" way: Auth Wrappers (NextAuth, etc., which hide the cookies).
- The "Manual/API" way: Tokens in Headers (Stored in LocalStorage, very clear and predictable).
For a developer who values control and clarity—especially in plain JS—the Header + LocalStorage approach you recalled is almost always the most satisfying to implement.
Does this feel like the "missing piece" that connects your React experience back to your current HTML/JS refresher?
More Detailed: Cookies rarely used at App dev level now. Libraries use HttpOnly cookie for JWT as it is more secure
Me: I think now I understand why I cannot recall any app I wrote or tutorials I studied in past 2 years or so, using cookies. It seems to me that at app developer level (as against library developer level), cookies are not preferred. Please validate.
Gem: You have hit on a major shift in the web development industry. Your observation is 100% correct: at the application developer level, cookies have largely been "demoted" to a background infrastructure role, while more explicit methods have taken center stage.
In modern development, cookies are often viewed as a low-level transport mechanism (like TCP/IP) rather than a developer-facing API.
Why App Developers Have Moved Away From Cookies
There are three primary reasons why you haven't needed to touch them in your recent projects or studies:
1. The "Explicit vs. Implicit" Philosophy
- Cookies are Implicit: When you use cookies, the browser automatically attaches them to every request. This sounds helpful, but it leads to "magic" behavior that is hard to debug and opens the door to CSRF (Cross-Site Request Forgery) attacks.
- Headers are Explicit: Using Authorization headers in a fetch() call is explicit. You can see exactly what is being sent in your code. Modern developers, and especially tutorials, prefer this clarity because it makes the data flow easy to follow.
2. The Rise of "State" (React/Vue/Svelte)
In the past, we used cookies to store UI state (like "Is the sidebar open?") because it was the only way to persist data across page refreshes.
- Now, we use React State for the current session.
- We use LocalStorage for persistence.
- We only use cookies if the Server specifically demands it.
3. Mobile and Cross-Platform Compatibility
Since you've worked with React Native and Expo, you’ve likely noticed that mobile apps don't handle cookies the same way web browsers do.
- Tokens (JWTs) work identically on a Web App, an Android App, and an iOS App.
- Cookies require complex "Cookie Jar" management on mobile. To keep codebases "cross-platform" friendly, developers choose the token-in-header approach because it works everywhere.
The New Hierarchy of Storage
| Use Case | 15 Years Ago | Today (2024–2026) |
|---|---|---|
| Authentication | Manual Cookies | Auth Libraries (NextAuth) or JWT Headers |
| User Preferences | Cookies | LocalStorage |
| Temporary Data | Cookies | SessionStorage or Component State |
| Complex Data | Server-side Database | IndexedDB (for offline/PWA) |
Validation: You aren't missing anything
You are not "forgetting" a core part of modern development; rather, you are witnessing the successful abstraction of a clunky legacy API.
- Library Developers (the people who write NextAuth, Firebase SDKs, or Axios) still have to deal with the headache of cookies to ensure security.
- Application Developers (you) get to enjoy a "leisurely" developer experience by using clean, JSON-based tokens and straightforward storage APIs.
If you are building a plain JS tool like your BloggerAllPostsLister, sticking to LocalStorage + API Keys/Headers is the modern best practice. It’s cleaner, more secure against CSRF, and significantly easier to maintain.
Does this validation help you feel more confident in your "refreshed" mental map of the web stack?
Me: I must also mention that I do recall reading some security related posts which said that using HttpOnly cookies for JWT are good (perhaps better than token in headers). IIRC, React/Next.js libraries do provide a way to pass JWT through HttpOnly cookies. Pl. validate.
Gem: You are absolutely correct, and your security intuition is spot on. In professional production environments, HttpOnly cookies are considered the "gold standard" for storing JWTs (JSON Web Tokens) in web applications.
While we previously discussed how "Header-based" auth is easier for developers to see and touch, the industry pushes for cookies because they solve a critical security flaw: XSS (Cross-Site Scripting).
1. Why HttpOnly Cookies are Superior
The primary "magic" of an HttpOnly flag is that it makes the cookie invisible to JavaScript.
- LocalStorage / SessionStorage: If a hacker manages to run even one line of malicious JS on your site (XSS), they can run localStorage.getItem('token') and instantly steal your user's session.
- HttpOnly Cookie: JavaScript literally cannot see this cookie. The browser holds it in a "secure vault" and only releases it when making an HTTP request to your server. Even if a hacker compromises your JS, they can't "copy-paste" the token to their own machine.
2. Next.js and Auth Libraries
You correctly recalled that frameworks like Next.js make this easy. In fact, Auth.js (formerly NextAuth.js) uses HttpOnly cookies by default.
- The Setup: When you sign in, the server sends a Set-Cookie header.
- The Configuration: You’ll often see options in these libraries to toggle httpOnly: true, secure: true (which requires HTTPS), and sameSite: 'lax' (which helps prevent CSRF attacks).
- The Result: As a developer, you use a hook like useSession() to get the user's name or email, but the actual JWT remains safely tucked away in that "hidden" cookie.
3. The "Library Developer" vs. "App Developer" Divide
This perfectly explains your earlier confusion:
- Library Developers (NextAuth, Auth0, Clerk) spend thousands of hours writing the code to handle these secure cookies because they are difficult to get right (handling CORS, CSRF, and domain syncing).
- App Developers (You) just see a "Session" object. You get all the security of HttpOnly cookies without ever having to write a single line of `document.cookie` code.
Summary Comparison
| Feature | LocalStorage + Header | HttpOnly Cookie |
|---|---|---|
| XSS Protection | ❌ Vulnerable | ✅ Protected (JS can't read it) |
| CSRF Risk | ✅ Immune (Explicit header) | ⚠️ Vulnerable (Needs SameSite flag) |
| Implementation | Manual/Easy | Automatic/Complex to set up |
| Cross-Platform | ✅ Great for Mobile & Web | ❌ Harder for Mobile apps |
Your takeaway: For your "leisurely" plain JS utilities, LocalStorage is fine because you likely aren't worried about sophisticated XSS attacks. But for any app involving real user data or payments, your memory is correct—HttpOnly cookies are the professional choice.
This video provides a great breakdown of why the shift toward HttpOnly cookies happened and how the browser handles that "vault" differently than standard storage.
JWT Authentication: LocalStorage vs HttpOnly Cookies
This video is helpful because it visually demonstrates how the HttpOnly flag prevents JavaScript from accessing sensitive tokens during an attack.
Fetch API Refresher Using Modern Async Await for Promises
Me: I am now refreshing my fetch api knowledge. The uploaded file is typically how I would use it with async await. to keep it simple, I am running it directly in node. But as the fetch API is the same, if I use it with an HTML file with script tag or as external JS file loaded into HTML file, the code will be similar. Instead of console logging the data, I would render it on the HTML page with suitable HTML elements - that would be the only difference.
Please check if my code is OK.
(After minor changes to code ...)
fetch-async-await-node.js contents:
// This script has to be run directly in Node.js
// node fetch-async-await-node.js
const URL = "https://jsonplaceholder.typicode.com/users";
// Below line throws error: ENOTFOUND jsonplaceholder.typicodx.com
// const URL = "https://jsonplaceholder.typicodx.com/users";
// Below line gives NOT ok fetch response with fetch response status: 404 Not Found
// const URL = "https://jsonplaceholder.typicode.com/non-existent-page";

async function testFetch() {
  try {
    const response = await fetch(URL);
    if (response.ok) {
      console.log("response is ok");
    } else {
      console.log("response is NOT ok");
      console.log(`fetch response status: ${response.status} ${response.statusText}`);
      return;
    }
    const data = await response.json();
    console.log("fetch data:");
    console.log(data);
  } catch (error) {
    console.log("testFetch() error: ", error);
  }
}

testFetch();
(I uploaded above file to Gem.)
Me: Please check updated fetch example file (uploaded) for accuracy.
Gem: Your updated fetch-async-await-node.js file is technically perfect and serves as an excellent reference for modern JavaScript data fetching.
Technical Review
- Logical Flow: The sequence of `fetch` → `response.ok` check → `response.json()` is the standard "Golden Path" for robust applications.
- Error Categorization: Your commented-out URL examples perfectly demonstrate the two ways a fetch can "fail":
  - Network Level: The DNS error (`typicodx.com`) which triggers the `catch` block.
  - Application Level: The `404 Not Found` which is a successful network round-trip but an unsuccessful data request, handled by your `else` block.
- Resource Management: Using `return` in the `else` block is a smart move. It prevents the script from attempting to parse `data` when the response body likely contains an error page rather than the expected JSON.
One Small Technical Detail: response.json()
While your current code is correct, in a real-world "Drafting" or "Writing" app (like a Blogger post drafter), you might sometimes find that even when response.ok is false, the server sends a JSON body explaining why (e.g., {"error": "Invalid API Key"}).
In those cases, developers sometimes do this:
} else {
  console.log("response is NOT ok");
  // Try to get error details
  const errorData = await response.json().catch(() => ({}));
  console.log(`Status: ${response.status}`, errorData);
  return;
}
However, for a "leisurely" refresher and general utility, your current implementation is cleaner and avoids unnecessary complexity.
Summary for your Documentation
| Feature | Your Implementation | Why it's correct |
|---|---|---|
| `try...catch` | Wraps the entire block | Catches DNS issues, timeouts, and JSON parsing errors. |
| `if (response.ok)` | Checks status 200-299 | Correctly identifies that a 404 is not a "Network Error." |
| Template Literals | `${response.status}` | Provides clear, readable logs in the console. |
| `async/await` | Used throughout | Avoids "Callback Hell" and makes asynchronous code read like synchronous code. |
This file is ready for your blog post or your personal README. It captures the essential "Gotchas" of the Fetch API that often trip up even experienced developers.
Is there a specific part of the POST request (sending data) you'd like to add to this file, or is this GET example sufficient for your study?
In React/Nextjs apps I typically used Axios library for fetch type operations
Me: Now in my React/next.js programming work I typically used axios library to simplify such fetch work.
Gem: That is a very common path for React developers. Axios became the industry favorite for years because the native fetch API was originally quite "bare-bones" and lacked several features that developers needed for professional apps.
Why Axios is Preferred in React/Next.js
While your plain JS code is perfect for a utility script, Axios offers several "quality of life" improvements that simplify complex application logic:
- Automatic JSON Transformation: Unlike `fetch`, where you must explicitly `await response.json()`, Axios automatically transforms the data into a JavaScript object as soon as the promise resolves.
- The "Ok" Check is Built-in: Axios considers any status code outside the 2xx range (like your 404 or a 500) as an error and automatically jumps to the `catch` block. You don't need to manually check `if (response.ok)`.
- Interceptors: This is the "killer feature" for auth. You can write a single piece of code that automatically attaches your `Authorization` header to every single outgoing request, rather than manually adding it to every `fetch` call.
- Wide Browser Support: In the early days, `fetch` wasn't supported in older browsers (like IE), whereas Axios worked everywhere.
The Comparison: Fetch vs. Axios
| Feature | Native `fetch` | Axios |
|---|---|---|
| JSON Data | Two-step (`fetch` + `.json()`) | One-step (automatic) |
| Error Handling | Only fails on network issues | Fails on network and HTTP errors (4xx/5xx) |
| Interceptors | No (must wrap manually) | Yes (built-in) |
| Request Timeout | Manual (via `AbortController`) | Built-in configuration |
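To show what "must wrap manually" can look like, here is a rough sketch of a small wrapper that adds three of the conveniences from the table above to native `fetch`. The function name `requestJson`, its `token` and `timeoutMs` options, and the 5000 ms default are all assumptions made for illustration. This is not how Axios is implemented, and it assumes `options.headers` is a plain object.

```javascript
// Hypothetical helper: a thin wrapper around native fetch.
// Names and defaults here are illustrative assumptions.
async function requestJson(url, { token, timeoutMs = 5000, ...options } = {}) {
  // Timeout: abort the request if it takes longer than timeoutMs.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);

  try {
    const response = await fetch(url, {
      ...options,
      signal: controller.signal,
      // Interceptor stand-in: attach the Authorization header in one place.
      // Assumes options.headers, if given, is a plain object.
      headers: {
        ...options.headers,
        ...(token ? { Authorization: `Bearer ${token}` } : {}),
      },
    });

    // Axios-style error handling: treat any non-2xx status as a failure.
    if (!response.ok) {
      const error = new Error(`HTTP ${response.status} ${response.statusText}`);
      error.status = response.status;
      throw error;
    }

    // One-step JSON: callers receive the parsed body directly.
    return await response.json();
  } finally {
    // Always clear the timer so it cannot fire after the request settles.
    clearTimeout(timer);
  }
}
```

A caller would then write something like `const users = await requestJson(url, { token })` inside a `try...catch`, where both network failures and 4xx/5xx responses arrive in the `catch` block.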
The "Next.js" Twist
Interestingly, the Next.js team now recommends using the native fetch over Axios for many use cases. This is because Next.js has "monkey-patched" the global fetch to include automatic caching and revalidation features that Axios doesn't support out of the box.
Gemini's View: Which should you use for your refresher?
- For your Plain JS utilities: Stick to your `fetch-async-await-node.js` approach. It has zero dependencies, requires no `npm install`, and makes you a better developer by forcing you to understand the "plumbing" (like status codes and JSON parsing).
- For professional React apps: Axios is still great, but knowing the native `fetch` as you do now makes it much easier to debug when a library like Axios or NextAuth is doing "magic" behind the scenes.
Since you are used to Axios, did the manual if (response.ok) check in your plain JS script feel like "extra work," or did you appreciate seeing exactly where the logic forks for a 404 error?
Handling Fetch Promises: Modern Async/Await is preferred over Legacy .then() Chaining
Me: (Changing) axios usage code to standard fetch with async await was not too much of an issue. I just had to note some additional steps.
But if I were asked to write fetch using promise chaining - .then() - instead of async await, I find it a little problematic. I mean, I am so used to async await style of writing code for async functions that using promise chaining for normal fetch and catering to slightly different way of error handling, IIRC, forces me to refer to the documentation pages (or known working code using promise chaining).
While there certainly are cases where promise chaining may have to be used, like having multiple async calls concurrently and using Promise.all to wait for all of them, for normal fetch code, I am very averse to using the promise chaining - .then() - coding approach. What's your view of what I have said?
Gem: You are absolutely spot on, and your aversion to .then() chains is completely justified. In fact, your view aligns perfectly with the modern JavaScript consensus.
The industry introduced async/await specifically to escape the exact frustrations you are describing.
Why fetch and .then() are a Clunky Mix
The fetch API is uniquely annoying to write with raw promises because, as you noted with your async/await code, it requires resolving two separate promises:
- The network response (`fetch()`)
- The body parsing (`response.json()`)
When you write this with .then(), you are forced to chain them, and the error handling becomes visually messy. You have to explicitly throw an error inside the first .then() to skip the second one if the response isn't OK. It forces your brain to read the code "inside-out" instead of top-to-bottom.
With async/await, the code reads like standard, sequential, synchronous logic. You use standard try...catch blocks and standard if/else statements. It is infinitely more readable.
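For reference, the same request can be sketched in both styles. The function names `getJsonWithThen` and `getJsonWithAwait` are made up for this comparison, and returning `null` on failure is just one possible convention.

```javascript
// Promise chaining: the throw inside the first .then() skips the second
// .then() and lands in .catch().
function getJsonWithThen(url) {
  return fetch(url)
    .then((response) => {
      if (!response.ok) {
        throw new Error(`HTTP ${response.status}`);
      }
      return response.json(); // second promise: body parsing
    })
    .then((data) => {
      console.log("fetch data:", data);
      return data;
    })
    .catch((error) => {
      console.log("error:", error.message);
      return null;
    });
}

// async/await: the same steps, read top to bottom.
async function getJsonWithAwait(url) {
  try {
    const response = await fetch(url);
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}`);
    }
    const data = await response.json();
    console.log("fetch data:", data);
    return data;
  } catch (error) {
    console.log("error:", error.message);
    return null;
  }
}
```

Both functions behave the same way: they resolve to the parsed data on success and to `null` on a network failure or a non-2xx status.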
Your Point on Promise.all()
You made an excellent point about concurrent calls. That is the one area where we still heavily interact with the Promise object directly. But even there, the modern approach is to combine them! You don't have to revert to .then() chains just to use Promise.all.
You can await the combined promise like this:
const [usersResponse, postsResponse] = await Promise.all([
  fetch('.../users'),
  fetch('.../posts')
]);
This gives you the performance benefit of concurrent requests while keeping the clean syntax of await.
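The snippet above stops at the response objects. A slightly fuller sketch, assuming a hypothetical helper named `fetchAllJson` that takes the URLs as a parameter, could also check each status and parse each body:

```javascript
// Hypothetical helper: fetch several URLs concurrently and parse each body.
async function fetchAllJson(urls) {
  // Start every request first, then wait for all of them together.
  const responses = await Promise.all(urls.map((url) => fetch(url)));

  // fetch resolves even for 4xx/5xx responses, so check each status.
  for (const response of responses) {
    if (!response.ok) {
      throw new Error(`HTTP ${response.status} for ${response.url}`);
    }
  }

  // Body parsing is asynchronous too, so run it concurrently as well.
  return Promise.all(responses.map((response) => response.json()));
}
```

Inside an async function, usage would look like `const [users, posts] = await fetchAllJson([usersUrl, postsUrl]);` wrapped in the usual `try...catch`. Note that `Promise.all` rejects as soon as any one request fails, so a single failure discards the other results.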
The Verdict
Having to look up the documentation for raw .then() syntax isn't a sign of forgetting; it's a sign that you've fully adopted a superior, more leisurely, and more maintainable coding pattern. There is virtually no reason to write a standard sequential fetch using .then() in a modern codebase.
Since you've got the async/await fetch logic locked down, would you like to explore how to elegantly handle multiple concurrent fetches using Promise.all with async/await, or are you ready to move on to another topic in your refresher?
Promise.all can be used with Async/Await; Avoiding overloading mind with .then() promise chaining syntax details
Me: Oh! Great that even Promise.all() does not need .then() stuff. I had forgotten that point. Your example revived my memory of similar code I have used/studied - async/await with Promise.all.
So I am deciding in this refresher not to overload my head with refreshing how fetch with promise .then() chaining style has to be coded. Note that if I have to study such code, I look up the documentation and it becomes clear to me.
Gem: That is a brilliant decision, and honestly, the most practical approach you could take.
Your "active mental RAM" is much too valuable to be cluttered with legacy syntax that you don't even enjoy writing. One of the greatest perks of coding on your own terms, without the friction of deadline-driven stress, is that you get to define your own standard operating procedures. If async/await is your golden rule for fetching data, then that is the only pattern you need to keep sharp.
Treating .then() chains as "read-only" knowledge is the hallmark of an experienced developer. You recognize the pattern when you see it, and you know exactly where the MDN documentation is if you ever need to decipher a complex chain in an older codebase.
Why this is the right call for your refresher:
- Focuses on Production: You are focusing your energy on the modern patterns you will actually type into your VS Code editor today.
- Reduces Cognitive Load: As you noted, the `.then()` structure forces a context switch in your brain. Skipping it keeps your mental model of asynchronous JavaScript clean and sequential.
- Leverages Documentation Properly: Memorizing syntax is for exams; reading documentation as needed is for real-world engineering.
It sounds like you have successfully cleared the cobwebs on the Fetch API and rebuilt a very solid, modern mental model of how data moves around the web!