How Gutenberg Blocks Work

WordPress has been using the block editor for a few years now, and is finally starting to inch closer to full site editing. These upgrades are moving us inexorably toward block-based themes. Because of this, I think that anyone who's building in WordPress should get a solid grasp on how these things work. That's exactly what we're going to get into in this post. We're going to zoom way, way out and look at exactly what makes a Gutenberg block work.

How WordPress Content is Saved

Prior to the block editor, all sites ran on the TinyMCE editor (now known as the classic editor). When a post is saved, the HTML that makes up the post content is written to the post_content column of the wp_posts table. WordPress then uses the_content() to retrieve this HTML from the database and display it on your page. Seems pretty straightforward, right?

WordPress has a very strong commitment to backwards compatibility, and because of this commitment, the block editor has to work in exactly the same way. That is - it has to save the HTML content directly in the post_content column so the_content() can output it. To do this, every block contains a save function inside the JavaScript that defines the block. This function tells WordPress what HTML to save when the post is saved: it processes the block attributes and generates the HTML for that block. WordPress then combines the output of every block and saves it as a single HTML string in the post content. Neat! We now have a way to create blocks, and save content.
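To make the save function a little more concrete, here's a rough sketch in plain JavaScript. It is a simplification, not the real API: actual blocks are registered with registerBlockType() and their save function returns React elements rather than a string. The block name example/notice, its attributes, and the escapeHtml helper are all made up for illustration.

```javascript
// Escape text so attribute values cannot inject markup into the saved HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Simplified stand-in for a block definition (hypothetical block).
// Here save() returns a plain string so the sketch runs without WordPress.
const noticeBlock = {
  name: 'example/notice',
  attributes: {
    message: { type: 'string', default: '' },
    tone: { type: 'string', default: 'info' },
  },
  // save() receives the block's attributes and returns the markup to store.
  save({ attributes }) {
    const { message, tone } = attributes;
    return `<div class="notice notice-${tone}">${escapeHtml(message)}</div>`;
  },
};

const savedHtml = noticeBlock.save({
  attributes: { message: 'Backups run nightly', tone: 'info' },
});
console.log(savedHtml);
// <div class="notice notice-info">Backups run nightly</div>
```

The thing to notice is that save is just a function from attributes to markup. Nothing about the editor itself ends up in the output.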

How Block Data is Saved

But, hang on, if the blocks get converted into a single HTML string, how on earth does WordPress remember what blocks are on the page? How does it remember where these blocks are located? In what order? How does it remember what attributes were associated with a block when it was saved? Without all of this information, you could write a post with the block editor, but as soon as you left the editor the system would have no idea how to re-assemble the block editor when you return.

To resolve this problem, the block editor must store this data somewhere it can be accessed, processed, and used when loading the post. This seems like a solid opportunity to save a serialized array in the database, where the_content would return an array of blocks so you can loop through and render the content, but remember - WordPress loves backwards compatibility, and this would definitely break that.

Instead, the smarties contributing to WordPress introduced a syntax that gets embedded directly into the post content using HTML comments. This syntax wraps each block, and includes key information like the name of the block, as well as any attributes this block needs. This is then processed using some fancy regex to convert the content string into an array of blocks. The syntax looks a bit like this:

<!-- wp:paragraph -->
<p>So, to solve the problem without breaking backcompat, The smarties contributing to WordPress introduced a syntax that gets embedded directly into the post content using HTML comments. This syntax wraps each block, and includes key information like the name of the block, as well as any attributes this block needs. This is then processed using some fancy regex to convert the content string into an array of blocks. The syntax looks a bit like this:</p>
<!-- /wp:paragraph -->

Notice the HTML comments that wrap the actual output? Those get used to process these blocks and rebuild the editor when the post content is loaded. This means the WordPress content houses most of the actual HTML output and all of the block data as well. This really tripped me up at first - I expected block data would be stored in postmeta, or something else, but no - it's directly in the content. And after working with it a bit, I think it makes a lot of sense.
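Here's a small sketch of how that comment syntax could be produced. This is my own simplified version, not WordPress's serializer: the real one also escapes certain characters inside the attribute JSON, which I've skipped. The function names serializeBlock and serializePost are hypothetical.

```javascript
// Hypothetical helper: wraps a block's saved HTML in comment delimiters.
function serializeBlock(name, attributes, innerHtml) {
  // Core blocks drop the "core/" namespace in the delimiter (wp:paragraph).
  const shortName = name.startsWith('core/') ? name.slice(5) : name;
  const hasAttrs = Object.keys(attributes).length > 0;
  const attrs = hasAttrs ? ` ${JSON.stringify(attributes)}` : '';

  // A block with no saved markup collapses to one self-closing comment.
  if (!innerHtml) {
    return `<!-- wp:${shortName}${attrs} /-->`;
  }
  return `<!-- wp:${shortName}${attrs} -->\n${innerHtml}\n<!-- /wp:${shortName} -->`;
}

// Join every block into the single string that lands in post_content.
function serializePost(blocks) {
  return blocks
    .map((b) => serializeBlock(b.name, b.attributes, b.innerHtml))
    .join('\n\n');
}

const postContent = serializePost([
  { name: 'core/paragraph', attributes: {}, innerHtml: '<p>Hello</p>' },
  { name: 'core/heading', attributes: { level: 3 }, innerHtml: '<h3>Title</h3>' },
]);
console.log(postContent);
```

The order of the array becomes the order of the blocks in the string, and the attributes ride along as JSON inside the opening comment. That covers all three of the "how does it remember" questions.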

How Content is Loaded

Since the block editor saves the content as static HTML, the process of loading this content is pretty similar to how it has always worked - the HTML is fetched from the database, it gets filtered through the_content, and then the resulting HTML is displayed. The key difference is that there are a few extra steps to parse the HTML content just before it is displayed on the site.

But what about blocks that are rendered server-side? Some blocks don't save their markup in the block editor. Instead, they save a set of attributes and rely on PHP to render the content. This means that when the content is saved, the block is literally an HTML comment containing a JSON object of the block's attributes and nothing else. How does WordPress actually render this output?

WordPress hooks into the_content to transform these server-side rendered blocks into HTML, much like how a shortcode works. After that, the block comments are stripped out before the content is rendered. This data isn't needed to render the page, and exposing it could potentially be a security concern, so it's best to get rid of it.
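The two steps above can be sketched like this. Keep in mind that in WordPress this happens in PHP, and the render callback is a PHP function attached to the block when it is registered. What follows is only a JavaScript imitation of the idea: the renderCallbacks registry, the example/greeting block, and the regular expressions are my own assumptions, not WordPress's actual patterns.

```javascript
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical registry mapping block names to render callbacks.
const renderCallbacks = {
  'example/greeting': (attrs) => `<p>Hello, ${escapeHtml(attrs.name)}!</p>`,
};

function renderContent(content) {
  // 1. Swap each self-closing block comment for its rendered output.
  const selfClosing = /<!--\s+wp:([a-z0-9_\/-]+)\s+(\{[^>]*?\}\s+)?\/-->/g;
  const rendered = content.replace(selfClosing, (match, name, json) => {
    const callback = renderCallbacks[name];
    if (!callback) return ''; // unknown dynamic block: output nothing
    const attrs = json ? JSON.parse(json) : {};
    return callback(attrs);
  });
  // 2. Strip the remaining opening and closing comments, keeping the HTML.
  const delimiter = /<!--\s+\/?wp:[a-z0-9_\/-]+\s+(\{[^>]*?\}\s+)?-->\n?/g;
  return rendered.replace(delimiter, '');
}

const stored = [
  '<!-- wp:paragraph -->',
  '<p>Static text.</p>',
  '<!-- /wp:paragraph -->',
  '<!-- wp:example/greeting {"name":"Ada"} /-->',
].join('\n');

console.log(renderContent(stored));
// <p>Static text.</p>
// <p>Hello, Ada!</p>
```

The static paragraph passes through untouched apart from losing its comments, while the dynamic block is built entirely from its attributes at render time.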

How To Get Blocks From a WordPress Post

If you're working with server-side rendering, you don't have to worry about this because the callback includes the already-parsed block, but what happens if you want to do something else? Maybe you're using this block data in some really cool way, or perhaps you need to localize a script from the block data, or maybe you're only allowing certain people to see content inside a specific block. Either way, it is extremely helpful to have access to usable block data. Luckily, WordPress makes it pretty darn easy.

To get a list of blocks from a post, you just need to run a function called parse_blocks. This nifty function uses some fancy regex to transform the raw block comments in the post content into an array of usable block data. This includes any non-default block attributes, and any child blocks the block may have. All you need to feed it is the post content for your post.
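In PHP, the call itself is as short as parse_blocks( $post->post_content ). To show roughly what comes back, here's a simplified JavaScript imitation of the parsing step. It is a sketch under my own assumptions, not WordPress's parser: the regular expression is mine, text between top-level blocks is simply dropped, and each real parsed block also carries an innerContent entry that I've left out.

```javascript
// Simplified sketch of the array parse_blocks() produces.
function parseBlocks(content) {
  // Matches openers, closers, and self-closing block comments.
  const delimiter =
    /<!--\s+(\/)?wp:([a-z0-9_\/-]+)\s+(\{[^>]*?\}\s+)?(\/)?-->/g;
  const root = { innerBlocks: [], innerHTML: '' };
  const stack = [root]; // the last entry is the block currently open
  let cursor = 0;
  let match;

  while ((match = delimiter.exec(content)) !== null) {
    const [token, isCloser, rawName, json, isVoid] = match;
    const parent = stack[stack.length - 1];
    // Markup between delimiters belongs to the block that is open.
    parent.innerHTML += content.slice(cursor, match.index);
    cursor = match.index + token.length;

    if (isCloser) {
      if (stack.length > 1) stack.pop();
      continue;
    }
    const block = {
      // Names without a namespace are core blocks.
      blockName: rawName.includes('/') ? rawName : `core/${rawName}`,
      attrs: json ? JSON.parse(json) : {},
      innerBlocks: [],
      innerHTML: '',
    };
    parent.innerBlocks.push(block);
    if (!isVoid) stack.push(block); // self-closing blocks have no children
  }
  stack[stack.length - 1].innerHTML += content.slice(cursor);
  return root.innerBlocks;
}

const blocks = parseBlocks([
  '<!-- wp:group {"align":"wide"} -->',
  '<div class="wp-block-group alignwide">',
  '<!-- wp:paragraph -->',
  '<p>Hi</p>',
  '<!-- /wp:paragraph -->',
  '</div>',
  '<!-- /wp:group -->',
].join('\n'));

console.log(blocks[0].blockName); // core/group
console.log(blocks[0].attrs); // { align: 'wide' }
console.log(blocks[0].innerBlocks[0].blockName); // core/paragraph
```

Each entry has the block name, its attributes, its own markup, and a nested array of child blocks, so walking the whole post is a plain recursive loop.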

Conclusion

As the block editor continues to blossom into a full site editing experience, you can expect to need to work with blocks more. Taking a little time to understand fundamentally how they work will go a long way in helping you understand how to approach block-related challenges in the future.