FEP 001: URL Parsing in Client Applications
| Field | Value |
|---|---|
| FEP ID | 001 |
| Title | Using URLs in Client Applications |
| Author | Andrew Diller |
| Status | Draft |
| Type | Informational |
| Created | 2025-02-27 |
| Version | 1.0 |
Abstract
This document defines a standard approach for using URLs in client applications on supported platforms with FujiNet. This includes the Atari, Apple II, ADAM, CoCo, and other platforms, leveraging the FujiNet device for protocol adaptation and all network communication. It describes the structure of a URL, encoding and decoding mechanisms, and how a client application can interact with FujiNet to retrieve data from RESTful APIs.
This FEP proposes clear guidelines and best practices for parsing, encoding, and decoding URLs in the FujiNet firmware. Historically, different code paths and incomplete solutions led to confusion around handling special characters, query parameters, and path segments. This proposal outlines how to separate the concerns of URL parsing from encoding/decoding so that HTTP requests are consistently formed.
Motivation and Background
FujiNet operates on resource-constrained hardware (Atari 8-bit computers and similar). It is often used to retrieve files over HTTP or other protocols in a manner similar to local file systems. Historically, “mode 4 vs 8 vs 12” attempts to differentiate between purely file-based operations and general-purpose HTTP requests. The resulting code often merges path and query parameters into a single “path” field, then applies naive encoding with no knowledge of the characters that should remain unencoded (for example, &, ? when used as separators).
Robust handling requires:
- A dedicated parser that splits a valid URL into components (scheme, host, port, path, query, fragment)
- A dedicated builder/encoder that can accept raw components and encode them properly where needed
Relying on a simple urlEncode() function for every piece of the URL often leads to incorrect escaping of separators or failure to parse unencoded brackets and spaces.
Rationale
Separation of Concerns
- Parsing should not guess which parts need encoding. It is responsible for splitting the URL string into constituent parts, assuming a syntactically valid URL.
- Encoding/Decoding should be performed either by the user or by a separate builder utility, ensuring only the necessary segments (e.g., query parameter values) are escaped.
Clarity in Firmware
Many developers expect a library function that can “fix” or encode arbitrary paths, but this approach can fail when query separators or special characters get incorrectly escaped. A robust approach fosters correct usage: if a file path must be encoded, the user or a dedicated builder should do so before sending it to the parser.
Mode 4 vs 8 vs 12 (Atari Specific)
mode=4or8has traditionally signified “file-based” requests for direct directory/path listingsmode=12is used for general-purpose HTTP requests- Any approach to unify them must specify how query strings and special characters are handled, so FujiNet can properly convey the request upstream without guesswork
URL Structure
A URL (Uniform Resource Locator) consists of multiple components as defined by RFC 3986:
scheme://[user:password@]host[:port]/path[?query][#fragment]
Example
Given the URL:
https://user:pass@api.example.com:8080/data/info?query=super#section1
The parsed components are:
| Component | Value |
|---|---|
| Scheme | https |
| Userinfo | user:pass |
| Host | api.example.com |
| Port | 8080 |
| Path | /data/info |
| Query | query=super |
| Fragment | section1 |
Encoded Form
After parsing, FujiNet should prepare the URL for encoding by replacing special characters with their percent-encoded values:
https%3A%2F%2Fuser%3Apass%40api.example.com%3A8080%2Fdata%2Finfo%3Fquery%3Dtest%23section1
Encoding and Decoding
URL Encoding (Percent-Encoding)
Certain characters must be encoded for proper transmission. FujiNet’s URL encoder replaces special characters with % followed by the hexadecimal ASCII code.
| Character | Encoded Form |
|---|---|
| Space | %20 |
# | %23 |
? | %3F |
& | %26 |
/ | %2F |
URL Decoding
FujiNet decodes incoming URLs by replacing percent-encoded characters with their original representations before processing the request.
Sample Flow (Atari)
sequenceDiagram
participant Atari as Atari 8-Bit
participant FN as FujiNet (ESP32)
participant Server as Remote Server
Atari->>FN: Send URL string (ATASCII) over SIO
FN->>FN: Convert ATASCII to ASCII
FN->>FN: Parse URL into components
FN->>FN: Encode HTTP request
FN->>Server: Send HTTP request over TCP/IP
Server->>FN: HTTP response
FN->>FN: Decode response
FN->>Atari: Return parsed key-value pairs over SIO
- The Atari client application defines a URL string in ATASCII format
- The URL is sent to FujiNet over SIO
- FujiNet converts ATASCII to ASCII, parses the URL, extracts components, encodes an HTTP request, and sends it over TCP/IP
- The parsed response is returned as key-value pairs to the client
Detailed Specification
Parser Requirements
- Parse only syntactically valid, already-encoded URLs
- Distinguish components:
[scheme]://[user]:[password]@[host]:[port]/[path]?[query]#[fragment] - If an invalid character set is encountered (e.g., spaces in the raw URL), the parser should reject or handle it as an error (unless a local extension is agreed upon for backward compatibility)
Builder/Encoder Requirements
The encoder should provide helper methods:
| Method | Purpose |
|---|---|
encodePathSegment(segment) | Encodes reserved characters within a path segment but preserves / |
encodeQueryParamValue(value) | Encodes characters unsafe in a query parameter, preserving &, =, etc. |
encodeFragment(fragment) | Encodes fragment-specific reserved characters |
These methods accept unencoded strings (e.g., user input) and produce a fully valid URL string.
Implementation Notes
Backward Compatibility
For existing code that relies on naive approaches, minimal changes can maintain compatibility. In the future, consider adding a “strict mode” that rejects unencoded spaces or special characters in the parser.
Performance Considerations
On embedded hardware, every byte of code and data matters. Whichever builder or parser solution is adopted (e.g., a minimal custom parser or a trimmed-down URI parsing library) should remain memory-efficient.
Open Questions
- How should “file-based” HTTP operations be integrated without conflating path segments and query parameters?
- Should mode 4 or 8 remain or be replaced by more explicit HTTP-like modes that parse queries properly?