Switch over to WebAssembly, Rust and Yew #35

Merged
BlakeRain merged 87 commits from yew-static into main 2023-08-30 18:01:40 +00:00
7 changed files with 39 additions and 39 deletions
Showing only changes of commit 7faf659794

View File

@@ -30,7 +30,7 @@ So I decided to build a simple Python script that I can run as a cron job on the
To provide the actual search function, I decided that I'd add a simple Python web server that operates on a separate port to the main site (actually port [9443](https://blakerain.com:9443/search/not+a+search+term)), which provides an API that the client-side JavaScript can call to get search results.
-### Populating the Database
+# Populating the Database
The first step to developing the extractor script that would populate the search database was to get the shebang and imports out of the way. I knew that I wanted to have SQLite3, but I also needed the [requests](https://realpython.com/python-requests/) library to send HTTP requests to the Content API and the `os` module to allow me to pass the location of the database file and other settings as environment variables.
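As a rough sketch of that opening (the environment variable names, FTS3 schema, and Content API parameters here are illustrative assumptions, not the post's actual script), such an extractor might begin like this:

```python
#!/usr/bin/env python3
# Illustrative sketch only: the env var names, the FTS3 schema and the Content
# API parameters are assumptions, not the script described in the post.
import os
import sqlite3

import requests

DB_PATH = os.environ.get("SEARCH_DB", "search.db")
GHOST_URL = os.environ.get("GHOST_URL", "https://blakerain.com")
CONTENT_KEY = os.environ.get("GHOST_CONTENT_API_KEY", "")

db = sqlite3.connect(DB_PATH)
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS posts USING fts3(slug, title, body)")

resp = requests.get(
    f"{GHOST_URL}/ghost/api/v3/content/posts/",
    params={"key": CONTENT_KEY, "limit": "all", "formats": "plaintext"},
    timeout=30,
)
resp.raise_for_status()

for post in resp.json()["posts"]:
    db.execute(
        "INSERT INTO posts (slug, title, body) VALUES (?, ?, ?)",
        (post["slug"], post["title"], post.get("plaintext", "")),
    )
db.commit()
```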
@@ -139,7 +139,7 @@ With that out of the way I copied the Python script to the server, placing that
Confident that everything would magically work, I moved on to the search API.
-### Executing Search Queries
+# Executing Search Queries
In order for the FTS3 table to be searched by some client-side JavaScript I decided to create another Python script that would use [Flask-RESTful](https://flask-restful.readthedocs.io/en/latest/) to provide an API. This API would accept a single search term, query the database, and then return any results as JSON. The client-side JavaScript could then use this JSON to render the search results.
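A minimal Flask-RESTful endpoint of that shape might look as follows; the resource path, column names and the bare `app.run` (no TLS) are assumptions rather than the post's `simple-search.py`:

```python
# Minimal sketch of a search endpoint; the path and column names are assumed.
import sqlite3

from flask import Flask
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)


class Search(Resource):
    def get(self, term):
        db = sqlite3.connect("search.db")
        rows = db.execute(
            "SELECT slug, title FROM posts WHERE posts MATCH ?", (term,)
        ).fetchall()
        db.close()
        # Flask-RESTful serialises the returned value as JSON.
        return [{"slug": slug, "title": title} for slug, title in rows]


api.add_resource(Search, "/search/<string:term>")

if __name__ == "__main__":
    app.run(port=9443)
```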
@@ -280,7 +280,7 @@ curl https://blakerain.com:9443/search/not+a+search
[]
```
-### Client-Side Search
+# Client-Side Search
Now that the back-end of the search seems to be working okay (although I've not seen it bring through any results yet), I started out on the client side. I knew that I wanted two things:
@@ -373,7 +373,7 @@ A couple of things I will note, however:
1. Ghost lets you add some code injection for specific pages, which is where I added some specific styling for the result HTML.
1. Be aware that if the search API doesn't specify an `Access-Control-Allow-Origin` header then the web browser will refuse to make the request, even though the domain is actually the same (a small sketch of setting this header follows below).
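For completeness, one way to send that header from a Flask application is sketched here; the allowed origin is an assumption:

```python
from flask import Flask

app = Flask(__name__)


@app.after_request
def add_cors_header(response):
    # Without this header the browser blocks the cross-port request,
    # even though it goes to the same domain.
    response.headers["Access-Control-Allow-Origin"] = "https://blakerain.com"
    return response
```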
-### Conclusion
+# Conclusion
In conclusion it seems that adding a separate search facility to Ghost was a lot easier than I was worried it might be. I had originally concerned myself with modifying Ghost itself (I've no idea what JAMstack is or how Ghost actually works). After seeing the other implementations I was inspired to take this approach, which seems to have worked quite well. The search is fairly fast, and will probably remain so for the foreseeable future.
@@ -385,7 +385,7 @@ For now, you can find the Python scripts and the configuration files used on the
There you will find the sources such as `simple-search.py`.
-### Future Improvements
+## Future Improvements
There are a few things that I want to add to the search to improve it somewhat:

View File

@@ -17,7 +17,7 @@ When writing a user-space program that shares memory with a hardware device, we
To begin to understand this, we need to be notionally aware of the manner in which devices will access the memory that we share with them, and how to ask the OS to respect the physical location of that memory.
-### How Devices Can Access Memory
+# How Devices Can Access Memory
These days, devices that are connected to a computer are typically connected via PCI Express (usually abbreviated to PCIe). Such devices will typically include support for accessing memory via DMA (Direct Memory Access).
@@ -31,7 +31,7 @@ In this more recent model of PCI, the [North Bridge](<https://en.wikipedia.org/w
When programming a device connected via PCIe, you will typically be writing a base address for a region of memory that you have prepared for the device to access. However, this memory cannot be allocated in the usual way. This is due to the way memory addresses are translated by the [MMU](https://en.wikipedia.org/wiki/Memory_management_unit) and the operating system: the memory that we traditionally allocate from the operating system is _virtual_.
-### Virtual and Physical Addresses
+# Virtual and Physical Addresses {#virt-and-phy-addresses}
Typical memory allocation, such as when we use `malloc` or `new`, ultimately uses memory the operating system has reserved for our process. The address that we receive from the OS will be an address in the [virtual memory](https://en.wikipedia.org/wiki/Virtual_memory) maintained by the OS.
@@ -48,7 +48,7 @@ It is important, therefore, that for any allocated memory we are able to obtain
In order to address these two primary concerns we need to look to an alternative means of memory allocation than the traditional `malloc` and `new`. Moreover, as we are likely to need more than a standard page's worth of space (typically 4 KiB), we need to allocate memory using larger pages of memory.
-### Establishing Physical Addresses
+# Establishing Physical Addresses {#physical-addresses}
We understand that a process operates on virtual memory, and that memory is arranged in pages. The question now arises as to how we can establish the corresponding physical address for any given virtual address.
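On Linux the usual answer is the per-process `/proc/self/pagemap` file, which the post's `virtual_to_physical` function reads in C. As a sketch of the same lookup (in Python here, purely for illustration; it typically needs root for the kernel to expose the frame number):

```python
# Sketch of the /proc/self/pagemap lookup: one little-endian 64-bit entry per
# virtual page; bit 63 is "present" and bits 0-54 hold the page frame number.
import os
import struct

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")


def virtual_to_physical(vaddr: int) -> int:
    offset = (vaddr // PAGE_SIZE) * 8
    with open("/proc/self/pagemap", "rb") as pagemap:
        pagemap.seek(offset)
        (entry,) = struct.unpack("<Q", pagemap.read(8))
    if not entry & (1 << 63):
        raise RuntimeError("page is not present in RAM")
    pfn = entry & ((1 << 55) - 1)
    return pfn * PAGE_SIZE + (vaddr % PAGE_SIZE)
```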
@@ -130,7 +130,7 @@ static uintptr_t virtual_to_physical(const void *vaddr) {
We can now use the `virtual_to_physical` function to ascertain the physical address of some memory that we allocate from the operating system. This is the address that we pass on to our hardware.
-### Linux Huge Pages
+# Linux Huge Pages {#hugepages}
Now that we know how to establish the physical address corresponding to a virtual address, the problem remains that we need to obtain an address for _contiguous physical memory_, rather than merely the physical address of a single page. We are also still limited by the fact that the operating system may subject our memory to swapping and other operations.
@@ -165,7 +165,7 @@ Something to note is that the kernel will try and balance the huge page pool ove
Huge pages provide a rather nice solution to our problem of obtaining large contiguous regions of memory that are not going to be swapped out by the operating system.
-### Establishing Huge Page Availability
+# Establishing Huge Page Availability {#hugepage-availability}
The first step towards allocating huge pages is to establish what huge pages are available to us. To do so we're going to query some files in the `/sys/kernel/mm/hugepages` directory. If any huge pages are configured, this directory will contain sub-directories for each huge page size:
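The post does this in C++ (`HugePageInfo::load`); as a quick illustration of the same sysfs walk, sketched here in Python (the directory entries look like `hugepages-2048kB`):

```python
# Sketch: list configured huge page sizes and how many pages are free.
import os

BASE = "/sys/kernel/mm/hugepages"
for entry in sorted(os.listdir(BASE)):
    size_kb = int(entry.removeprefix("hugepages-").removesuffix("kB"))
    with open(os.path.join(BASE, entry, "free_hugepages")) as f:
        free = int(f.read())
    print(f"{size_kb} KiB huge pages: {free} free")
```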
@@ -257,7 +257,7 @@ std::vector<HugePageInfo> HugePageInfo::load() {
}
```
-### Allocating a Huge Page
+# Allocating a Huge Page {#allocating}
Each huge page allocation is described by a `HugePage` structure. This structure encapsulates the virtual and physical address of an allocated huge page along with the size of the page in bytes.
@@ -296,7 +296,7 @@ HugePage::Ref HugePageInfo::allocate() const {
The value that we return from `allocate` constructs a `HugePage` with the virtual address that we received from `mmap`, the equivalent physical address as calculated by our `virtual_to_physical` function and the size of the huge page.
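For readers who want the shape of that `mmap` call without the surrounding C++, here is a hedged sketch: it assumes 2 MiB huge pages are configured, and spells `MAP_HUGETLB` out numerically in case the interpreter's `mmap` module does not expose it.

```python
# Sketch: map one anonymous huge page, touch it so it is faulted in, and pair
# the virtual address with a physical address. 0x40000 is Linux's MAP_HUGETLB.
import ctypes
import mmap

MAP_HUGETLB = getattr(mmap, "MAP_HUGETLB", 0x40000)
HUGE_PAGE_SIZE = 2 * 1024 * 1024  # assumes 2 MiB huge pages are available

buf = mmap.mmap(
    -1,
    HUGE_PAGE_SIZE,
    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS | MAP_HUGETLB,
    prot=mmap.PROT_READ | mmap.PROT_WRITE,
)
buf[0:1] = b"\x00"  # fault the page in so it has a physical frame
vaddr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
# phys = virtual_to_physical(vaddr)  # using the pagemap sketch shown earlier
```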
-### Deallocating a Hugepage
+# Deallocating a Hugepage {#deallocating}
Once we no longer wish to retain a huge page we need to release it back into the huge page pool maintained by the operating system.
@@ -309,7 +309,7 @@ HugePage::~HugePage() {
}
```
-### Dividing Up a Hugepage into Buffers
+# Dividing Up a Hugepage into Buffers {#dividing-into-buffers}
**Note:** _If you only wanted to know about the allocation of huge pages then you can skip to the [conclusion](#conclusion)._
@@ -635,7 +635,7 @@ DMAPool::~DMAPool() {
With the `DMAPool` implemented we can begin to portion out buffers of the required size and alignment to hardware. Hardware will require the physical address of each `Buffer` we allocate from the pool, which is available in the `Buffer::phy` field. Our process is also able to access this memory via the pointer in the `Buffer::address` field.
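The carving itself is simple arithmetic once the huge page guarantees physical contiguity; a toy sketch (the names mirror the post's `Buffer` fields, but this is not the post's implementation) might look like:

```python
# Toy sketch: split a physically contiguous region into aligned buffers.
from dataclasses import dataclass


@dataclass
class Buffer:
    address: int  # virtual address used by our process
    phy: int      # physical address handed to the hardware


def carve(vaddr: int, paddr: int, region_size: int, buf_size: int, align: int):
    # Assumes vaddr and paddr are themselves aligned to `align`.
    stride = ((buf_size + align - 1) // align) * align
    buffers = []
    offset = 0
    while offset + buf_size <= region_size:
        buffers.append(Buffer(vaddr + offset, paddr + offset))
        offset += stride
    return buffers
```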
-### Conclusion
+# Conclusion {#conclusion}
Preparing memory for use with DMA may seem a bit more complex than necessary. As developers we're often shielded from the details of memory management by useful abstractions such as those provided by `malloc` and `new`. This can mean that we are rarely exposed to the manner in which memory is managed by the operating system and our programs.

View File

@@ -13,7 +13,7 @@ Recently I've been experimenting with various garbage collection implementations
Before we get going with the details of this approach, I thought I'd set the scene a little with an overview of the garbage collection mechanism known as _tri-color marking_.
-### What is Tri-color Marking
+# What is Tri-color Marking
Tri-color marking was first described, I think, by [Dijkstra et al](https://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/EWD520.html) as part of the garbage collector for a LISP system. This algorithm is used as an enhancement to a simpler mark-and-sweep approach.
@@ -130,7 +130,7 @@ These two behaviours manifested in different ways during testing. The first feat
The second behaviour, that the number of passes is a function of the longest object chain, became quite apparent in tests that involved long chains of objects. Again, the JSON parser was a culprit of this behaviour of the marking process. If the GC executed whilst the parse tree was being referenced, multiple passes were required to walk all the nodes of the AST. Indeed this was quite likely: the JSON file was quite large, and the memory pressure increased quite drastically, often triggering multiple minor GC passes whilst the tree was being built.
-### Bitmap Marking
+# Bitmap Marking
The part of the approach that stuck with me was the use of bitmaps to perform the tricolor marking process. To start this off, imagine we performed allocations within a set of _blocks._ Each block describes a region of memory that our allocator will meter out for each allocation request.
@@ -177,7 +177,7 @@ We can see that this represents four allocations in a block of 16 cells. The use
The last three fields of the `BlockInfo` structure are the white, grey and black sets. These bitmaps are not based on the size of a cell, but on the size of a pointer: each bit represents a region that is exactly the number of bytes in a pointer.
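To make the three sets concrete, here is a very small model of the idea, with integers standing in for the bitmaps; this is only an illustration of the colouring step, not the post's C++ `BlockInfo`:

```python
# Toy model: white = candidate pointers, grey = reachable but unscanned,
# black = reachable and scanned. Each bit covers one pointer-sized slot.
class BlockInfo:
    def __init__(self):
        self.white = 0
        self.grey = 0
        self.black = 0

    def promote(self, bit: int):
        # A white pointer has been found reachable: shade it grey.
        if self.white & (1 << bit):
            self.white &= ~(1 << bit)
            self.grey |= 1 << bit

    def blacken(self, bit: int):
        # The slot's referents have been scanned: move it grey -> black.
        self.grey &= ~(1 << bit)
        self.black |= 1 << bit
```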
-### Populating the Bitmaps
+## Populating the Bitmaps
The first step to performing our tri-color marking is to populate the white and grey bitmaps for every block. We maintain all our blocks in a `BlockSet` structure. This structure has `begin` and `end` methods that let us iterate over the blocks in the set.
@@ -546,17 +546,17 @@ black bitmap TTTTTTTT TTTT---- TTTTTTTT ------TT
XXXX
```
-### Lingering Thoughts
+# Lingering Thoughts
I abandoned this approach to tri-color marking in its current guise. The process of performing the actual marking was, for most of my tests, quite performant for my needs. However, I found that the GC it was implemented in had a number of significant performance issues. Most of these were due to the way I'd implemented the GC, rather than specifically with the bitmap-based approach to tri-color marking.
-#### Too Many Variables and Not Enough Rigour
+## Too Many Variables and Not Enough Rigour
Fine-tuning all the variables in the GC didn't go well. There were quite a few variables, such as the size of each allocation cell in a page, the size of these pages, and so on. I never seemed to be able to balance these variables to provide a general configuration that was suitable for the range of workloads I anticipated.
I'm sure that I could have tuned these variables by taking a more rigorous approach to the design and testing of the GC. Better yet, a smarter GC could have tuned itself to a certain extent based on how it was being used. More likely would be that I would never find a "best fit" set of parameters, but I might learn something along the way.
-#### Maintaining Remembered Sets
+## Maintaining Remembered Sets
In order to be able to populate the white set with pointers in each block I decided to use smart pointers. This ended up being a terrible decision. The problem was exacerbated by these pointers being passed around all over the place. Turns out programs do this a lot. Who knew.
@@ -572,7 +572,7 @@ I think that a more suitable approach would have been to simply stop the world a
You know, like nearly every other GC does.
-#### Not Incremental or Concurrent
+## Not Incremental or Concurrent
Because I was treating the tri-colour marking process as distinct from the allocator and mutator, the GC was constantly re-building white and grey bitmaps for every block, every time it entered into a GC pass.
@@ -584,7 +584,7 @@ The marking process as implemented did not lend itself to being concurrent. I di
The problem was that the marking process synchronizes the blocks by their grey bitmaps in the `promote` function. When we promote a white pointer, we fill in the grey bitmap of the pointed-to block. This means that we can end up filling in the grey bitmap of a block being processed by another thread. I did find a few alternatives to this, such as work queues and incoming grey bitmaps, but it really seemed to be a bit of a hopeless pursuit by that point.
-#### No Generations
+## No Generations
I've saved what I felt was the best for last: one of the biggest failings of this implementation was that there's no consideration of object generations.
@@ -592,7 +592,7 @@ The generational hypothesis lends us a great advantage. If you've not heard of i
The upshot of this is that objects which are retained beyond an initial one or two passes of the GC should be moved to a subsequent generation. These later generations can be collected with a lower frequency.
-### Conclusion
+# Conclusion
I think that the bitmap-based marking is a nice approach to tri-color, as the marking process is quite efficient. It requires virtually no memory allocation beyond a few bitmaps, and those can be allocated along with the block and reused for each pass. The main bottleneck ended up being the promotion of white pointers.

View File

@@ -33,7 +33,7 @@ thumbnail: "https://opengraph.githubassets.com/e6c849253e37fbc1db7ae49d6368cc429
icon: "https://github.com/fluidicon.png"
```
-## Site Analytics
+# Site Analytics
A few months ago I decided to change the analytics for this website over to a custom analytics
implementation, replacing my use of [Simple Analytics](https://simpleanalytics.com). The analytics
@@ -54,7 +54,7 @@ point for a few reasons:
use-case for a Lambda function.
3. There's little pressure for these to be performant or stable, as it only affects this site 😆
-## Building Rust for AWS Lambda
+# Building Rust for AWS Lambda
I initially had a number of issues compiling Rust code for AWS Lambda using the method described in the
[README](https://github.com/awslabs/aws-lambda-rust-runtime#deployment) in the AWS Lambda Rust
@@ -98,7 +98,7 @@ cp $(ldd "$EXE_PATH" | grep ssl | awk '{print $3}') "$OUTPUT_DIR/lib/"
Now that my executables could be run by Lambda, I could start iterating the API and trigger
functions.
-## Implementing the API
+# Implementing the API
The initial implementation of the API in Rust has gone very easily, mostly due to the structures
provided in various crates available to Rust, including the [lambda-http] crate. I was able to
@@ -178,7 +178,7 @@ using `unwrap` and `expect` and allowing the Lambda function to panic.
i32::from_str_radix(item["ViewCount"].as_n().unwrap(), 10).unwrap() // 😤
```
-## Implementing the Trigger
+# Implementing the Trigger
Once I had come to understand the structures and functions in the Rust AWS client crates, I had a
far easier time building the trigger function. This function simply responds to events received
@@ -191,7 +191,7 @@ I was somewhat worried about the parsing of user agent strings: in Python I did
[woothee](https://crates.io/crates/woothee) that performs the same operation just as well for my
use case.
-## Conclusion
+# Conclusion
I was pleasantly surprised at how well the process went. Apart from the somewhat slow start getting
Rust code compiled for AWS Lambda on ARM, once I had my bearings it was quite easy going.

View File

@@ -27,7 +27,7 @@ My goal was to remove the EC2 and RDS instances and change the structure of the
1. Images would be stored in Amazon S3 by a custom storage adapter, and
1. A static site is generated, and then hosted by [Netlify](https://www.netlify.com).
-### Ghost and Docker
+# Ghost and Docker
I wanted to move the Ghost CMS from the EC2 instance into a Docker container on a local server at my home. To build this Docker container I used the [official Docker image](http://localhost:2368/p/754d8315-38fa-49ad-8ac1-62ffc1f02c2e/) as the base. I needed to add a [custom storage adapter](https://ghost.org/docs/config/#creating-a-custom-storage-adapter) that would make use of the AWS SDK to store images in S3. Therefore I needed to ensure that the [AWS SDK](https://www.npmjs.com/package/aws-sdk) was available in the image.
@@ -37,7 +37,7 @@ Once I had the Ghost instance up and running, migrating the data from one instan
There was one issue I had that ended up taking some time to remediate: the changeover of the storage adapter. Because I'd changed over to using S3 as the storage back-end, the URLs for the images in each of the blog posts were now incorrect. The first fix I considered was using SQL to find-and-replace all the URLs in the posts. However, in the end I opted for just editing each post and replacing the image. This is quite easy to do with the Ghost authoring tools. Moreover, this also gave me the opportunity to fix some of the screenshots.
-### Generating the Static Site
+# Generating the Static Site
In order to render the site I decided to use React Static: a static site generator for React. I chose this approach over other [much easier options](https://ghost.org/docs/jamstack/) as I wanted to move away from Ghost themes and I really enjoy using React :)
@@ -55,7 +55,7 @@ icon: "https://github.com/fluidicon.png"
I used the Ghost [Content API](https://ghost.org/docs/content-api/) to extract the navigation, posts, and pages. I then render them using React. The site is a very simple React application, with only a few components.
-### Deploying to Netlify
+# Deploying to Netlify
Deploying the site to Netlify is as easy as using the [Netlify CLI](https://docs.netlify.com/cli/get-started/) on the command line after building the static site using React Static. All I required was a Netlify personal access token and the API ID of the site, both of which can be easily found in the Netlify interface.
@@ -88,7 +88,7 @@ icon: "https://gist.github.com/fluidicon.png"
The final piece of the puzzle was to connect Ghost to GitHub: when I make a change to the site I wanted the GitHub workflow to execute. As the GitHub API requires authentication, I created a small [lambda function](https://github.com/BlakeRain/blakerain.com/blob/main/lambda/ghost-post-actions/index.js). This function processes the POST request from the Ghost CMS [webhook](https://ghost.org/docs/webhooks/) and in turn makes a call to the GitHub API to trigger a [workflow dispatch event](https://docs.github.com/en/rest/reference/actions#create-a-workflow-dispatch-event).
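The dispatch itself is a single authenticated POST to the GitHub API; the linked function is JavaScript, but the equivalent request looks roughly like this (the workflow file name and token source are assumptions):

```python
# Sketch of a workflow_dispatch trigger; "build.yml" is an assumed file name.
import os

import requests

requests.post(
    "https://api.github.com/repos/BlakeRain/blakerain.com/actions/workflows/build.yml/dispatches",
    headers={
        "Authorization": f"token {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github.v3+json",
    },
    json={"ref": "main"},
    timeout=10,
).raise_for_status()
```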
-### Final Thoughts
+# Final Thoughts
Now that I have a static version of the site, hosted for free at Netlify, I'm sure that I'll enjoy the cost saving (around $55 per month). Moreover the site loads significantly faster from the Netlify CDN than it did from the little EC2 instance. I feel much safer with the Ghost CMS administration interface running on a local server rather than it being exposed to the Internet.

View File

@@ -19,7 +19,7 @@ So for this first post I wanted to share some information relating to how I set
See here for adding search: [https://blakerain.com/blog/adding-search-to-ghost](http://localhost:2368/blog/adding-search-to-ghost)
-### Ghost CMS
+# Ghost CMS
I've used [Ghost](https://ghost.org) to create this website. Ghost is a CMS, written in JavaScript, that provides a nice set of features without seeming to be too bloated.
@@ -45,7 +45,7 @@ Apart from its small size and not being built in PHP, some of the features that
- I quite like working with JavaScript.
- Ghost seemed easy to self-host, which is usually my preferred option.
-### Deploying Ghost
+# Deploying Ghost
For the most part, installation of Ghost required following the instructions on the Ghost website. I roughly followed the guide for Ubuntu, as that is the distribution I chose:
@@ -100,7 +100,7 @@ Finally I could make sure that the site was running using `ghost ls`, which gave
![](/content/new-site-and-blog/image-17.png)
-### Customizing Ghost
+# Customizing Ghost
Once I had an installation that was working I wanted to be able to customize it. The first thing I wanted to do was to make sure that the site was not generally available. Conveniently Ghost includes a simple way of doing this by switching the site to private, disabling access, SEO and social features. This option can be found in the **General** settings of the administration portal:

View File

@@ -25,7 +25,7 @@ The new search comprises two main components: a front-end interface and a back-e
In this post I go into some detail of how the search is implemented. All the source code is available in the GitHub repository for this [blog](https://github.com/BlakeRain/blakerain.com).
-### Using a Prefix Tree
+# Using a Prefix Tree
One of my goals for the new search is that it should be interactive, and quite fast. That means it must quickly give the user reasonably useful results. Moreover, as I wanted to simplify the implementation, I would like to maintain very few dependencies.
@@ -85,7 +85,7 @@ Searching for occurrences from this node, the first leaf we reach is for the wor
Building the results in this way allows us to quickly ascertain that words starting with the two letters `"be"` can be found in _Document 2_ primarily (there are six occurrences) and in _Document 1_, where we find one occurrence.
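The shape of that structure is easy to sketch; a purely illustrative version (not the site's actual implementation, which lives in the repository linked above) might look like this:

```python
# Toy prefix tree: each node carries per-document occurrence counts, and a
# prefix search sums the counts of every node beneath the prefix.
from collections import Counter, defaultdict


class TrieNode:
    def __init__(self):
        self.children = defaultdict(TrieNode)
        self.occurrences = Counter()  # document id -> count of this word


def insert(root: TrieNode, word: str, doc_id: int) -> None:
    node = root
    for ch in word:
        node = node.children[ch]
    node.occurrences[doc_id] += 1


def search(root: TrieNode, prefix: str) -> Counter:
    node = root
    for ch in prefix:
        if ch not in node.children:
            return Counter()
        node = node.children[ch]
    totals: Counter = Counter()
    stack = [node]
    while stack:
        current = stack.pop()
        totals.update(current.occurrences)
        stack.extend(current.children.values())
    return totals
```

A search for `"be"` over the two documents described above would then return a mapping from each document to its occurrence count, which is exactly the information the results list needs.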
-### Generating the Search Data
+# Generating the Search Data
To build the prefix tree I decided to create a GitHub action. This action would be configured to be run at a certain interval (such as every hour) to regenerate the search data.
@@ -119,7 +119,7 @@ Once the search data has been built and the file has been generated, the GitHub
This data is then loaded and parsed by the search front-end.
-### Search Front-End
+# Search Front-End
The search interface is a small amount of React code that is compiled along with the customized Casper theme for the site. The interface loads and parses the search data from S3. For profiling, it outputs a console message indicating how many posts and trie nodes were loaded from the search data, and the time it took:
@@ -135,7 +135,7 @@ When a link is clicked in the search results, the page is opened. The link contains
![](/content/updated-site-search/Selection_2056.png)
-### Conclusion
+# Conclusion
I find this search implementation to be far simpler to maintain and use. We use a similar search system in our internal compliance management system at [Neo](https://neotechnologiesltd.com/). This removes the reliance on a secondary server that was solely used to service search queries. This leads to a cleaner approach that will also simplify moving the site to a CDN.