
Scryber.Core Architecture Document

Table of Contents

  1. Overview
  2. System Architecture
  3. Project Structure
  4. Component Model
  5. PDF Generation Pipeline
  6. Subsystem Deep Dive
  7. Data Flow
  8. Design Patterns
  9. Extension Architecture
  10. Performance Considerations

Overview

Purpose

Scryber.Core is a .NET PDF generation engine that transforms HTML/XML templates with CSS styling into high-quality PDF documents. It bridges web technologies (HTML, CSS, JavaScript-like expressions) with PDF output, enabling developers to create complex documents using familiar web development patterns.

Key Design Goals

  • Web-First: Use HTML and CSS as primary authoring format
  • Data Binding: Support dynamic content through expression evaluation
  • Extensibility: Allow custom components, styles, and behaviors
  • Multi-Platform: Support .NET 6, 8, 9, and .NET Standard 2.0
  • WASM Compatible: Run in Blazor WebAssembly environments
  • Performance: Efficient layout and rendering for large documents
  • Standards Compliance: Follow CSS box model and HTML semantics

Technology Stack

  • .NET Multi-targeting: net6.0, net8.0, net9.0, netstandard2.0
  • HTML Parsing: HtmlAgilityPack for loose HTML, System.Xml for strict XHTML
  • Image Processing: SixLabors.ImageSharp
  • Font Support: Custom OpenType parser (Scryber.Core.OpenType)
  • Expression Engine: Custom expression parser and evaluator
  • PDF Generation: Direct PDF structure writing (no dependencies on external PDF libraries)

System Architecture

High-Level Architecture

┌──────────────────────────────────────────────────────────────────┐
│                        Public API Layer                          │
│  Document.ParseDocument() / Document.ParseHtmlDocument()         │
│  Document.SaveAsPDF() / Document.SaveAsPDFAsync()                │
└───────────────────────────┬──────────────────────────────────────┘
                            │
┌───────────────────────────┴──────────────────────────────────────┐
│                    Component Tree Layer                          │
│  HTMLDiv, HTMLSpan, Page, Section, Table, Image, etc.            │
└───────────────────────────┬──────────────────────────────────────┘
                            │
        ┌───────────────────┼───────────────────┐
        │                   │                   │
        ▼                   ▼                   ▼
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│    Styles    │    │   Binding    │    │   Layout     │
│  Subsystem   │    │  Subsystem   │    │  Subsystem   │
│              │    │              │    │              │
│ CSS Parser   │    │  Expression  │    │   Engine     │
│ Selectors    │    │  Evaluator   │    │   Managers   │
│ Cascading    │    │  Data Path   │    │   Measurers  │
└──────┬───────┘    └──────┬───────┘    └──────┬───────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                           │
                ┌──────────┴───────────┐
                │                      │
                ▼                      ▼
        ┌──────────────┐      ┌──────────────┐
        │   Resources  │      │  PDF Writer  │
        │              │      │              │
        │  Fonts       │      │  Objects     │
        │  Images      │      │  Streams     │
        │  Shared      │      │  References  │
        └──────────────┘      └──────────────┘

Layered Architecture

Layer 1: Foundation (Scryber.Common)

  • Core interfaces and contracts
  • PDF primitive types and structures
  • Resource management abstractions
  • Configuration and logging

Layer 2: Specialized Services

  • Drawing (Scryber.Drawing): Graphics primitives, fonts, colors, SVG
  • Expressions (Scryber.Expressive): Expression parsing and evaluation
  • Styles (Scryber.Styles): CSS parsing and style management
  • Imaging (Scryber.Imaging): Image loading and format conversion

Stem Layer: Object Graph Generation (Scryber.Generation)

  • HTML to XHTML Conversion
  • XML Namespace to Assembly Namespace mapping
  • Reflective parser for Object Type lookup
  • Graph construction and property assignment

Layer 3: Integration and Component Definition (Scryber.Components)

  • Component implementations
  • HTML element mapping
  • Layout engine
  • PDF generation orchestration

Layer 4: Framework Integration (Scryber.Components.Mvc)

  • ASP.NET MVC extensions
  • HTTP response integration

Significant Referenced Libraries

NuGet Packages

  • Scryber.Core.OpenType: Reads OpenType font files (.otf, .ttf, .otc), evaluates font properties, and measures strings.
  • SixLabors.ImageSharp: Decodes all supported image types and converts them to binary data where needed.
  • HtmlAgilityPack: Converts 'loose' HTML to valid XHTML that the reader can process.
  • Newtonsoft.Json: Extracts values from JSON content that the library has decoded from a string or stream.

Open source frameworks

  • bijington/expressive: Forms the base of the expression parser and function list, although the source has been modified significantly to work with the library.

Project Structure

Dependency Graph

Scryber.Common
    │
    ├─→ Scryber.Drawing ─────────┐
    │       (fonts, graphics)    │
    │                            │
    ├─→ Scryber.Expressive ──────┤
    │       (expressions)        │
    │                            ▼
    ├─→ Scryber.Styles ←──── Scryber.Generation
    │       (CSS)              (parsing, binding)
    │                            │
    ├─→ Scryber.Imaging ─────────┤
    │       (images)             │
    │                            │
    └─→ Scryber.Components ◄─────┘
            (main engine)
                │
                ▼
        Scryber.Components.Mvc
            (ASP.NET)

Project Responsibilities

Scryber.Common

Purpose: Foundation layer with core abstractions

Key Namespaces:

  • Scryber: Core interfaces (IComponent, IDocument, IBindableComponent)
  • Scryber.PDF: PDF primitive types (PDFString, PDFNumber, PDFDictionary)
  • Scryber.PDF.Native: Low-level PDF reading and writing
  • Scryber.PDF.Resources: Resource management (ISharedResource)
  • Scryber.Html: HTML entity definitions
  • Scryber.Logging: Trace and performance logging

Key Types:

  • IComponent: Base interface for all components with lifecycle methods
  • IPDFComponent: Components that can render to PDF
  • IResourceContainer: Manages document-level resources
  • PDFObjectRef: Indirect object references in PDF structure

Scryber.Drawing

Purpose: Graphics primitives and typography

Key Namespaces:

  • Scryber.Drawing: Core types (PDFColor, PDFUnit, PDFPoint, PDFRect)
  • Scryber.Drawing.Fonts: Font management and metrics
  • Scryber.Drawing.Svg: SVG path parsing and rendering
  • Scryber.PDF.Resources: Font resource generation

Key Types:

  • FontFactory: Creates and caches font instances
  • PDFFontResource: Manages font resources in PDF output
  • PDFSolidBrush, PDFSolidPen: Drawing styles
  • SVGPath: SVG path data parsing and rendering

Design Notes:

  • Embeds standard PDF fonts as resources (Helvetica, Times, Courier, etc.)
  • Uses Scryber.Core.OpenType for TrueType/OpenType parsing
  • Font metrics used for text measurement during layout

Scryber.Expressive

Purpose: Expression parsing and evaluation engine, based on bijington/expressive (https://github.com/bijington/expressive)

Key Namespaces:

  • Scryber.Expressive: Core expression types and parser
  • Scryber.Expressive.Expressions: Expression tree nodes
  • Scryber.Expressive.Functions: Built-in functions
  • Scryber.Expressive.Operators: Mathematical and logical operators

Key Types:

  • ExpressionParser: Tokenizes and parses expression strings
  • IExpression: Base interface for expression tree nodes
  • Context: Evaluation context with variables and functions
  • BinaryExpressionBase: Base for operator expressions
  • FunctionExpression: Function call expressions
  • VariableExpression: Variable lookup expressions

Expression Syntax:

Variables:    {{model.name}}
Properties:   {{model.user.firstName}}
Indexing:     {{model.items[0]}}
Math:         {{price * 1.2}}
Functions:    {{concat(firstName, ' ', lastName)}}
Conditionals: {{age >= 18 ? 'Adult' : 'Minor'}}

Built-in Functions:

  • concat(...): String concatenation
  • if(condition, true, false): Conditional evaluation
  • index(): Current iteration index in templates
  • length(array): Array/collection length
  • And more…

Scryber.Styles

Purpose: CSS parsing, selector matching, and style cascading

Key Namespaces:

  • Scryber.Styles: Style classes and definitions
  • Scryber.Styles.Parsing: CSS parser infrastructure
  • Scryber.Styles.Parsing.Typed: Individual CSS property parsers
  • Scryber.Styles.Selectors: Selector matching and specificity

Key Types:

  • CSSStyleParser: Main CSS parsing entry point
  • StylesDocument: Container for style collections (can be remote loaded)
  • CSSStyleItemReader: Tokenizes CSS content
  • Individual parsers: CSSBackgroundParser, CSSFontParser, CSSBorderParser, etc.
  • StyleMatcher: Matches selectors to components
  • StyleStack: Manages style inheritance and cascading

CSS Feature Support:

  • Selectors: element, class, ID, attribute, pseudo-elements (:before, :after)
  • Properties: Most CSS 2.1 properties plus common CSS3 features
  • Variables: var(--custom-property)
  • Calc: calc(100% - 20px) (partial support)
  • Counters: counter-reset, counter-increment, counter()
  • Content: content property for generated content

Design Pattern: Each CSS property has a dedicated typed parser class

  • Example: CSSFontParser handles font, font-family, font-size, etc.
  • Parsers convert CSS text values to typed style objects
  • Allows clean separation and easy extension

Scryber.Generation

Purpose: Reflective XML parsing and data binding infrastructure

Key Namespaces:

  • Scryber.Generation: Parser infrastructure and component creation
  • Scryber.Binding: Data binding and expression evaluation
  • Scryber.Binding.Expressions: Data path navigation

Key Types:

  • ParserDefintionFactory: Creates parser definitions for component types
  • ParserControllerDefinition: Defines controller attachments
  • BindingCalcExpressionFactory: Creates binding expressions from templates
  • BindingCalcParser: Integrates Expressive engine with template binding
  • ParserItemExpression: XPath-like data navigation

Binding Architecture:

Template: <div>{{model.user.name}}</div>
           │
           ▼
BindingCalcParser.Parse("model.user.name")
           │
           ▼
Creates Expression Tree
           │
           ▼
DataBind phase evaluates with Context
           │
           ▼
Result written to component property

Scryber.Imaging

Purpose: Image loading, decoding, and PDF formatting

Key Namespaces:

  • Scryber.Imaging: Factory infrastructure
  • Scryber.Imaging.Formatted: PDF image data formatters

Key Types:

  • ImageFactoryList: Manages registered image factories
  • ImageFactoryJpeg, ImageFactoryPng, etc.: Format-specific factories
  • PDFImageData: Abstract base for image data
  • PDFImageJpegData: JPEG passthrough (no re-encoding)
  • PDFImageSharpRGB24Data, PDFImageSharpRGBA32Data: Color format converters

Design Notes:

  • Uses SixLabors.ImageSharp for decoding
  • JPEG images passed through without re-encoding
  • Other formats converted to RGB24 or RGBA32 for PDF
  • Supports data URLs: data:image/png;base64,...
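
The data URL support noted above can be illustrated with a short sketch. This is not Scryber's implementation: ParseDataUrl is a hypothetical helper that assumes a base64-encoded payload and splits the URL into its media type and decoded bytes, roughly the input a format-specific image factory would need.

```csharp
using System;

// Hypothetical helper (not a Scryber API): splits a base64 data URL
// into its media type and the decoded binary payload.
(string MediaType, byte[] Data) ParseDataUrl(string url)
{
    const string scheme = "data:";
    const string marker = ";base64,";

    if (!url.StartsWith(scheme, StringComparison.OrdinalIgnoreCase))
        throw new FormatException("Not a data URL.");

    int markerIndex = url.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
    if (markerIndex < 0)
        throw new FormatException("Only base64 data URLs are handled in this sketch.");

    string mediaType = url.Substring(scheme.Length, markerIndex - scheme.Length);
    byte[] data = Convert.FromBase64String(url.Substring(markerIndex + marker.Length));
    return (mediaType, data);
}

// The payload here is just the 8-byte PNG file signature.
var (type, bytes) = ParseDataUrl("data:image/png;base64,iVBORw0KGgo=");
Console.WriteLine($"{type}: {bytes.Length} bytes");
```

The media type would then select the factory (PNG in this case), while the bytes take the place of a downloaded file.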

Scryber.Components

Purpose: Main PDF generation engine - orchestrates all subsystems

Key Namespaces:

  • Scryber.Components: Core component types
  • Scryber.Html.Components: 80+ HTML element implementations
  • Scryber.Html.Parsing: HTML parsing and component factory
  • Scryber.PDF.Layout: Layout engine and layout items
  • Scryber.PDF.Native: PDF generation and writing
  • Scryber.Components.Lists: List and list item components
  • Scryber.Components.Tables: Table, row, and cell components

Key Types:

  • Document: Root component and main public API
  • Page, PageBase, Section: Page-level components
  • HTMLParser: Parses HTML using HtmlAgilityPack
  • PDFLayoutDocument: Manages layout state
  • LayoutEngineDocument, LayoutEnginePage, etc.: Layout engines
  • PDFLayoutPage, PDFLayoutBlock, PDFLayoutLine: Layout items
  • PDFWriter: Low-level PDF structure writing

Component Hierarchy:

Component (abstract base)
    │
    ├─→ VisualComponent
    │       │
    │       ├─→ ContainerComponent
    │       │       │
    │       │       ├─→ PageBase → Page, Section
    │       │       ├─→ Panel → Div, Span, Canvas
    │       │       ├─→ ListItem
    │       │       └─→ TableCell
    │       │
    │       ├─→ Image
    │       ├─→ LineBreak
    │       └─→ Shape (line, rectangle, etc.)
    │
    ├─→ TextLiteral (pure text)
    ├─→ StylesDocument (external CSS)
    └─→ Template (repeating content)

HTML Element Mapping (examples):

  • <div> → HTMLDiv → extends Div → extends Panel
  • <span> → HTMLSpan → extends Span → extends Panel
  • <table> → HTMLTable → extends TableGrid
  • <img> → HTMLImage → extends Image → extends ImageBase
  • <p> → HTMLParagraph → extends Div

Component Model

Component Lifecycle

All components implement IComponent with these lifecycle phases:

1. Construction
   Component created via factory or constructor

2. Init(InitContext)
   - Register with document by ID
   - Initialize child components
   - Set up component relationships

3. Load(LoadContext)
   - Load external resources (async)
   - Images, fonts, CSS files
   - Process remote references

4. DataBind(DataContext)
   - Evaluate {{...}} expressions
   - Populate templates
   - Apply dynamic data

5. Layout (implicit during render)
   - Build applied styles
   - Measure component dimensions
   - Calculate positions
   - Handle page breaks
   - Create layout items

6. Render (implicit during SaveAsPDF)
   - Generate PDF structure
   - Write to output stream

7. Dispose()
   - Clean up resources
   - Release cached data

Component Responsibilities

Base Component:

  • Lifecycle management
  • Parent/child relationships
  • ID registration
  • Style association

Visual Component (extends Component):

  • Position and size
  • Margins, padding, borders
  • Background and fill
  • Visibility

Container Component (extends Visual):

  • Child management
  • Layout strategy
  • Content flow
  • Page breaking

Context Objects

Context objects are passed through each lifecycle phase rather than being stored on components:

InitContext:

  • Document reference
  • Trace logging
  • Performance tracking

LoadContext:

  • Async loading support
  • Resource cache
  • Base URL for relative paths

DataContext:

  • Data stack (scoped variables)
  • Expression evaluation
  • Template iteration

LayoutContext:

  • Current page
  • Available space
  • Font resources
  • Graphics state

RenderContext:

  • PDF writer
  • Resource registration
  • Current stream

PDF Generation Pipeline

Stage 1: Parsing

Input: HTML/XML string or file path
Output: Component tree

Two Parser Paths:

  1. XML Parser (strict XHTML):
    • Uses System.Xml.XmlReader
    • Requires well-formed XML
    • Namespace-aware
    • Fast and memory-efficient
  2. HTML Parser (loose HTML):
    • Uses HtmlAgilityPack
    • Tolerates malformed HTML
    • Auto-closes tags
    • Slightly slower
    • Ultimately calls back to XHTML parser with sanitised content

Process:

HTML String
    │
    ▼
HtmlDocument.Load() [HtmlAgilityPack]
    │
    ▼
HTMLParser.Parse(HtmlNode)
    │
    ├─→ HTMLParserComponentFactory.Create(tag name)
    │       │
    │       └─→ Creates component instance
    │
    └─→ Recursively parse children
            │
            ▼
        Complete Component Tree

Component Creation:

  • Factory maintains dictionary: tag name → component type
  • Example: "div" → typeof(HTMLDiv)
  • Unknown tags create generic containers or are ignored
  • Attributes parsed and applied to component properties

Stage 2: Initialization

Input: Component tree
Output: Registered and initialized components

Process:

Document.Init(InitContext)
    │
    ├─→ Register component IDs
    │       (enables ID-based lookups)
    │
    ├─→ Initialize child components
    │       (recursive)
    │
    └─→ Set up resource containers
            (fonts, images)

Key Activities:

  • Components register themselves by ID with document
  • Parent-child relationships established
  • Style classes validated
  • Font families resolved to font definitions

Stage 3: Loading

Input: Initialized component tree
Output: Tree with loaded external resources

Process:

Document.Load(LoadContext)
    │
    ├─→ Load external CSS files
    │       │
    │       └─→ StylesDocument.Load()
    │               │
    │               └─→ HTTP GET (async)
    │
    ├─→ Load external images
    │       │
    │       └─→ Image.Load()
    │               │
    │               ├─→ HTTP GET (async)
    │               └─→ ImageFactory.Load()
    │                       │
    │                       └─→ Decode to PDFImageData
    │
    └─→ Load external fonts
            │
            └─→ FontFactory.GetFont()
                    │
                    ├─→ HTTP GET (async) for web fonts
                    └─→ Parse TrueType/OpenType

WASM Considerations:

  • All HTTP requests are async
  • No blocking I/O allowed
  • Can use DocumentTimerExecution to yield periodically
  • Resources cached to avoid redundant downloads
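
As a rough illustration of the caching point above, the sketch below keeps one in-flight or completed download task per URL so that repeated references share a single request. The names are hypothetical and DownloadAsync is a stand-in for a real HTTP GET; Scryber's own resource cache may be organised quite differently.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

int downloads = 0;
var cache = new ConcurrentDictionary<string, Task<byte[]>>();

// Stand-in for an HTTP GET; a real loader could call HttpClient.GetByteArrayAsync.
async Task<byte[]> DownloadAsync(string url)
{
    downloads++;
    await Task.Yield();                 // yield to the caller instead of blocking
    return new byte[] { 1, 2, 3 };
}

// Repeated requests for the same URL reuse the cached task.
Task<byte[]> LoadAsync(string url) => cache.GetOrAdd(url, DownloadAsync);

byte[] first = await LoadAsync("https://example.com/logo.png");
byte[] second = await LoadAsync("https://example.com/logo.png");
Console.WriteLine($"downloads: {downloads}, shared: {ReferenceEquals(first, second)}");
```

Caching the task rather than the bytes means a second request that arrives while the first is still loading awaits the same download.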

Stage 4: Data Binding

Input: Loaded component tree + data model
Output: Tree with evaluated expressions

Process:

Document.DataBind(DataContext)
    │
    └─→ For each component:
            │
            ├─→ Evaluate {{...}} in attributes
            │       │
            │       └─→ BindingCalcParser.Parse()
            │               │
            │               └─→ ExpressionParser.Parse()
            │                       │
            │                       └─→ Expression tree
            │
            ├─→ Evaluate {{...}} in text content
            │
            ├─→ Process templates (<template data-bind="...">)
            │       │
            │       └─→ For each item in bound collection:
            │               │
            │               ├─→ Clone template content
            │               ├─→ Push item to data stack
            │               └─→ DataBind clone
            │
            └─→ Recursively bind children

Data Context Stack:

  • Context maintains stack of data scopes
  • model.name resolves from current scope
  • .name (dot prefix) means current item
  • Template iterations push new scope

Example:

<div>{{model.title}}</div>
<ul>
    <template data-bind="{{model.items}}">
        <li>{{.name}}</li>  <!-- . refers to current item -->
    </template>
</ul>
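
The scoping rules in this example can be sketched with a plain stack. The code below is an assumption-laden illustration, not Scryber's DataContext: the nested dictionaries stand in for the data model, itemStack is a hypothetical name for the data stack, and Resolve handles only dotted property paths.

```csharp
using System;
using System.Collections.Generic;

// Stand-in data model: { model: { title, items: [ { name }, ... ] } }
var root = new Dictionary<string, object>
{
    ["model"] = new Dictionary<string, object>
    {
        ["title"] = "Fruit",
        ["items"] = new List<object>
        {
            new Dictionary<string, object> { ["name"] = "Apple" },
            new Dictionary<string, object> { ["name"] = "Pear" },
        },
    },
};
var itemStack = new Stack<object>();   // hypothetical data stack of current items

// "a.b" resolves from the root scope; ".b" (dot prefix) resolves from the current item.
object Resolve(string path)
{
    bool relative = path.StartsWith('.');
    object current = relative ? itemStack.Peek() : root;
    foreach (string segment in path.Split('.', StringSplitOptions.RemoveEmptyEntries))
        current = ((IDictionary<string, object>)current)[segment];
    return current;
}

var lines = new List<string> { (string)Resolve("model.title") };
foreach (object item in (List<object>)Resolve("model.items"))
{
    itemStack.Push(item);                  // each template iteration pushes a new scope
    lines.Add((string)Resolve(".name"));   // bound against the current item
    itemStack.Pop();                       // scope removed once the clone is bound
}
Console.WriteLine(string.Join(", ", lines));
```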

Stage 5: Style Resolution

Input: Data-bound component tree + CSS
Output: Components with resolved styles

Process:

For each component:
    │
    ├─→ Collect applicable styles:
    │       │
    │       ├─→ Inline styles (highest priority)
    │       ├─→ ID selector styles
    │       ├─→ Class selector styles
    │       ├─→ Element selector styles
    │       └─→ Inherited styles
    │
    ├─→ Calculate specificity
    │       │
    │       └─→ Sort by: !important > inline > ID > class > element
    │
    ├─→ Apply cascading
    │       │
    │       └─→ Merge in specificity order
    │
    ├─→ Push 'Applied Style' onto the current Style Stack
    │
    ├─→ Resolve computed values
    │       │
    │       ├─→ Inherit from parent where applicable
    │       ├─→ Resolve relative units (%, em)
    │       ├─→ Evaluate var() and calc()
    │       └─→ Build *complete style* from direct and inherited values
    │
    │    (continue to use the style and process children)
    │
    └─→ Pop 'Applied Style' from the stack once processed

Style Inheritance:

Inheritance is controlled by the individual StyleKeys and the Items they belong to.

  • Font properties inherit by default
  • Box model properties don’t inherit

Stage 6: Layout

Input: Styled component tree
Output: PDFLayoutDocument with positioned items

Core Concept: Two-stage rendering separates measurement from output

Layout Process:

Document.RenderToPDF(context)
    │
    └─→ LayoutEngineDocument.Layout()
            │
            ├─→ For each page component:
            │       │
            │       └─→ LayoutEnginePage.Layout()
            │               │
            │               ├─→ Measure header
            │               ├─→ Measure footer
            │               ├─→ Calculate content area
            │               │
            │               └─→ Layout content:
            │                       │
            │                       └─→ LayoutEnginePanel.Layout()
            │                               │
            │                               ├─→ For each child:
            │                               │       │
            │                               │       ├─→ Measure required space
            │                               │       ├─→ Apply positioning
            │                               │       └─→ Layout child content
            │                               │
            │                               └─→ Handle page breaks:
            │                                       │
            │                                       ├─→ If content overflow
            │                                       ├─→ Create continuation
            │                                       └─→ Split content across pages
            │
            └─→ Returns PDFLayoutDocument

Layout Engines (by component type):

  • LayoutEngineDocument: Top-level coordinator
  • LayoutEnginePage: Page layout with header/footer
  • LayoutEnginePanel: Block and inline flow
  • LayoutEngineTable: Table layout with colspan/rowspan
  • LayoutEngineList: Numbered and bulleted lists
  • LayoutEngineText: Text flow and line breaking

Text Layout:

LayoutEngineText.Layout(available width)
    │
    ├─→ Split text into words
    │
    ├─→ For each word:
    │       │
    │       ├─→ Measure word width with font metrics
    │       │
    │       ├─→ If fits on current line:
    │       │       └─→ Add to line
    │       │
    │       └─→ If doesn't fit:
    │               ├─→ Apply hyphenation if enabled
    │               ├─→ Break line
    │               └─→ Continue on next line
    │
    └─→ Create PDFLayoutLine items
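
The word-by-word flow above is a greedy line-breaking strategy, which the following sketch reproduces under simplifying assumptions. MeasureWidth is a stand-in for real font metrics (every character is assumed to be 6 units wide), hyphenation is left out, and the function names are hypothetical rather than taken from LayoutEngineText.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for font metrics: every character is assumed to be 6 units wide.
double MeasureWidth(string text) => text.Length * 6.0;

List<string> BreakLines(string text, double availableWidth)
{
    var lines = new List<string>();
    string current = "";
    foreach (string word in text.Split(' ', StringSplitOptions.RemoveEmptyEntries))
    {
        string candidate = current.Length == 0 ? word : current + " " + word;
        if (current.Length == 0 || MeasureWidth(candidate) <= availableWidth)
        {
            current = candidate;    // fits, or is the first word on an empty line
        }
        else
        {
            lines.Add(current);     // does not fit: break the line
            current = word;         // and continue on the next one
        }
    }
    if (current.Length > 0) lines.Add(current);
    return lines;
}

// 60 units of width leaves room for 10 characters per line.
List<string> wrapped = BreakLines("The quick brown fox jumps over the lazy dog", 60);
foreach (string line in wrapped)
    Console.WriteLine(line);
```

Each returned string corresponds to what would become one PDFLayoutLine; a word wider than the available width is placed on a line of its own rather than split.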

Layout Items (output structure):

PDFLayoutDocument
    │
    └─→ Pages: List<PDFLayoutPage>
            │
            ├─→ HeaderBlock: PDFLayoutBlock
            ├─→ FooterBlock: PDFLayoutBlock
            │
            └─→ ContentBlock: PDFLayoutBlock
                    │
                    └─→ Columns: List<PDFLayoutRegion>
                            │
                            └─→ Contents: List<PDFLayoutItem>
                                    │
                                    ├─→ PDFLayoutBlock (container)
                                    ├─→ PDFLayoutLine (text)
                                    ├─→ PDFLayoutRun (content)
                                    └─→ PDFLayoutImage, etc.

Box Model:

┌─────────────────────────────────────┐
│ Margin (transparent)                │
│  ┌───────────────────────────────┐  │
│  │ Border                        │  │
│  │  ┌─────────────────────────┐  │  │
│  │  │ Padding                 │  │  │
│  │  │  ┌───────────────────┐  │  │  │
│  │  │  │ Content           │  │  │  │
│  │  │  │                   │  │  │  │
│  │  │  └───────────────────┘  │  │  │
│  │  │                         │  │  │
│  │  └─────────────────────────┘  │  │
│  │                               │  │
│  └───────────────────────────────┘  │
│                                     │
└─────────────────────────────────────┘

Positioning Modes:

  1. Block Flow (default for divs):
    • Stacks vertically
    • Takes full width
    • Respects margins
  2. Inline Flow (default for spans):
    • Flows horizontally
    • Wraps at container edge
    • Vertical alignment
  3. Relative Positioning:
    • Offset from normal position
    • Space still reserved in flow
  4. Absolute Positioning:
    • Removed from flow
    • Positioned relative to container
    • No space reserved
  5. Float (left/right):
    • Removed from flow
    • Content wraps around
    • Cleared with clear property

Stage 7: Rendering

  • Input: PDFLayoutDocument
  • Output: PDF byte stream
  • Co-ordinator: PDFWriter
  • Helper: PDFGraphics

The PDFWriter handles the output of content to the underlying stream, using a stack of PDFIndirectObjects. The PDFLayoutItems, and their associated PDFXObjects and PDFResources, begin 'objects' and instruct the writer what to output. Only once an indirect object is completed and closed are its contents written to the base stream, at which point it is popped from the stack.

Any further writing then continues on the previous object in the stack.

The PDFGraphics instance acts as a higher-level stream writer, emitting the instructions for drawing rectangles, setting up fonts, and rendering characters in a form that PDF readers understand.
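
The stack behaviour of the writer can be sketched in a few lines. This is a simplified model under stated assumptions, not the PDFWriter API: BeginObject, Write, and EndObject are hypothetical names, a StringBuilder stands in for the base stream, and each open object buffers its content until it is closed.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

var baseStream = new StringBuilder();                        // stand-in for the output stream
var open = new Stack<(int Number, StringBuilder Body)>();    // currently open indirect objects
int nextNumber = 1;

int BeginObject()
{
    open.Push((nextNumber, new StringBuilder()));
    return nextNumber++;
}

// Writes always target the innermost open object.
void Write(string text) => open.Peek().Body.Append(text);

// Only a closed object reaches the base stream; it is then popped from the stack.
void EndObject()
{
    var (number, body) = open.Pop();
    baseStream.Append($"{number} 0 obj\n{body}\nendobj\n");
}

int page = BeginObject();
Write("<< /Type /Page /Resources << /Font << /F1 ");
int font = BeginObject();                 // started while the page is still open
Write("<< /Type /Font /Subtype /TrueType >>");
EndObject();                              // the font is flushed first
Write($"{font} 0 R >> >> >>");            // writing resumes on the page object
EndObject();

Console.Write(baseStream);
Console.WriteLine($"page object: {page}, font object: {font}");
```

In this model the font (object 2) appears in the output before the page (object 1) that references it, because the page stays buffered until it is closed.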

Process:


PDFLayoutDocument.OutputToPDF(PDFRenderContext, PDFWriter)
    │
    ├─→ Write PDF header
    │       (%PDF-1.4 or later)
    │
    ├─→ Write document catalog
    │       │
    │       ├─→ Pages tree root
    │       ├─→ Outlines (bookmarks)
    │       ├─→ Named destinations
    │       └─→ Metadata
    │
    ├─→ For each layout page:
    │       │
    │       ├─→ Create page object
    │       ├─→ Create content stream
    │       │       │
    │       │       └─→ Write drawing commands:
    │       │               │
    │       │               ├─→ Set graphics state
    │       │               ├─→ Draw backgrounds
    │       │               ├─→ Draw borders
    │       │               ├─→ Draw text
    │       │               ├─→ Draw images
    │       │               └─→ Draw shapes
    │       │
    │       └─→ Register page resources
    │
    ├─→ Write font resources
    │       │
    │       └─→ Font descriptors + font programs
    │
    ├─→ Write image resources
    │       │
    │       └─→ Image XObjects with data
    │
    ├─→ Write cross-reference table
    │       (byte offsets of all objects)
    │
    └─→ Write trailer
            (points to catalog and xref)

PDF Structure (simplified):

%PDF-1.4
1 0 obj          % Document Catalog
  << /Type /Catalog
     /Pages 2 0 R >>
endobj

2 0 obj          % Pages Tree
  << /Type /Pages
     /Kids [3 0 R]
     /Count 1 >>
endobj

3 0 obj          % Page 1
  << /Type /Page
     /Parent 2 0 R
     /Contents 4 0 R
     /Resources << /Font << /F1 5 0 R >>
                   /XObject << /Im1 6 0 R >> >> >>
endobj

4 0 obj          % Page Content Stream
  << /Length 123 >>
stream
BT
/F1 12 Tf
100 700 Td
(Hello World) Tj
ET
endstream
endobj

5 0 obj          % Font Resource
  << /Type /Font
     /Subtype /TrueType
     ... >>
endobj

6 0 obj          % Image Resource
  << /Type /XObject
     /Subtype /Image
     /Width 100
     /Height 100
     ... >>
endobj

xref             % Cross-reference table
0 7
0000000000 65535 f
0000000009 00000 n
...

trailer
<< /Size 7
   /Root 1 0 R >>
startxref
1234
%%EOF
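
To show where the numbers in the cross-reference table come from, the sketch below records the byte offset of each object as it is written and then emits the xref entries and trailer. It is a minimal illustration with hypothetical names (WriteObject, offsets), not the library's writer, and the two objects are only skeletons.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

var pdf = new StringBuilder("%PDF-1.4\n");
var offsets = new List<int>();

void WriteObject(string body)
{
    // Record the byte offset at which "N 0 obj" starts (content is ASCII only).
    offsets.Add(Encoding.ASCII.GetByteCount(pdf.ToString()));
    pdf.Append($"{offsets.Count} 0 obj\n{body}\nendobj\n");
}

WriteObject("<< /Type /Catalog /Pages 2 0 R >>");
WriteObject("<< /Type /Pages /Kids [] /Count 0 >>");

int xrefStart = Encoding.ASCII.GetByteCount(pdf.ToString());
pdf.Append($"xref\n0 {offsets.Count + 1}\n");
pdf.Append("0000000000 65535 f \n");              // entry 0 heads the free list
foreach (int offset in offsets)
    pdf.Append($"{offset:D10} 00000 n \n");       // 10-digit offset, 5-digit generation
pdf.Append($"trailer\n<< /Size {offsets.Count + 1} /Root 1 0 R >>\n");
pdf.Append($"startxref\n{xrefStart}\n%%EOF\n");

Console.Write(pdf);
```

The first object starts at byte 9 because the header line is nine bytes long, which is why the listing above shows 0000000009 for object 1.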

Subsystem Deep Dive

CSS Parser Architecture

Entry Point: CSSStyleParser.ParseCSS(string cssContent)

Parse Flow:

CSS String
    │
    ▼
CSSStyleItemReader (tokenizer)
    │
    ├─→ Remove /* comments */
    ├─→ Identify selectors
    └─→ Extract property blocks
        │
        ▼
For each selector + properties:
    │
    ├─→ Create Style object
    │
    └─→ For each property:
            │
            └─→ Route to typed parser:
                    │
                    ├─→ CSSFontParser for font-*
                    ├─→ CSSColorParser for color, background-color
                    ├─→ CSSBorderParser for border-*
                    ├─→ CSSPaddingParser for padding-*
                    └─→ etc.
                        │
                        └─→ Parse value and set on Style

Example CSS Property Parser:

// CSSFontParser.cs
public class CSSFontParser : CSSStyleAttributeParser<FontStyle>
{
    protected override bool DoSetStyleValue(Style style,
                                           CSSStyleItemReader reader,
                                           PDFContextBase context)
    {
        string value = reader.CurrentTextValue;

        // 'property' stands for the name of the CSS property currently being read
        if(property == "font-family")
        {
            style.Font.FontFamily = ParseFontFamily(value);
        }
        else if(property == "font-size")
        {
            style.Font.FontSize = ParseUnit(value);
        }
        // ... more properties

        return true;
    }
}

CSS Specificity Calculation:

Inline styles:      1000 points
ID selectors:       100 points
Class selectors:    10 points
Element selectors:  1 point

Example:
  div.header           → 1 + 10 = 11
  #main                → 100
  div#main.header      → 1 + 100 + 10 = 111
  style="..."          → 1000
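
The point values above can be checked with a few lines of code. This is a minimal sketch that scores a single compound selector such as `div#main.header`. The `Specificity` function is hypothetical and ignores combinators, attribute selectors and pseudo-classes.

```csharp
using System;

// Illustrative only: scores one compound selector using the weights listed
// above (element = 1, class = 10, ID = 100). Not Scryber's implementation.
static int Specificity(string selector)
{
    int score = 0;

    // A selector that starts with a name (not '#', '.' or '*') has an element part.
    if (selector.Length > 0 && selector[0] != '#' && selector[0] != '.' && selector[0] != '*')
        score += 1;

    foreach (char c in selector)
    {
        if (c == '#') score += 100;      // ID selector
        else if (c == '.') score += 10;  // class selector
    }

    return score;
}

Console.WriteLine(Specificity("div.header"));      // 11
Console.WriteLine(Specificity("#main"));           // 100
Console.WriteLine(Specificity("div#main.header")); // 111
```

A single integer score is a simplification: in standard CSS, specificity is compared component by component, so no number of class selectors can outrank one ID selector.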

Expression Engine Architecture

Expression Grammar:

expression  := term (('+' | '-') term)*
term        := factor (('*' | '/') factor)*
factor      := number | string | variable | function | '(' expression ')'
variable    := identifier ('.' identifier | '[' expression ']')*
function    := identifier '(' arguments ')'
arguments   := expression (',' expression)*

Parse Example:

Input: "{{(model.price * 1.2)}}"

Tokenize:
  OPEN_BRACE
  OPEN_PAREN
  IDENTIFIER("model")
  DOT
  IDENTIFIER("price")
  MULTIPLY
  NUMBER(1.2)
  CLOSE_PAREN
  CLOSE_BRACE

Parse to Expression Tree:
  BinaryExpression(*)
    ├─→ PropertyExpression
    │     ├─→ VariableExpression("model")
    │     └─→ Property("price")
    └─→ ConstantExpression(1.2)

Evaluate with Context:
  context["model"] = { price: 10.0 }

  PropertyExpression.Evaluate()
    → model.price → 10.0

  BinaryExpression.Evaluate()
    → 10.0 * 1.2 → 12.0

Result: 12.0
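
The grammar maps naturally onto a recursive-descent evaluator, with one function per rule. The sketch below covers only the arithmetic subset (numbers, dotted variables, the four operators and parentheses) and evaluates directly instead of building an expression tree. `Evaluate` and its flat dictionary context are assumptions made for this illustration, not Scryber's expression API.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical evaluator for the arithmetic subset of the grammar above.
// Strings, indexers and functions are deliberately left out.
static double Evaluate(string text, IReadOnlyDictionary<string, double> variables)
{
    int pos = 0;

    void SkipSpaces() { while (pos < text.Length && char.IsWhiteSpace(text[pos])) pos++; }

    // Consumes the character c if it is next (after whitespace).
    bool Match(char c)
    {
        SkipSpaces();
        if (pos < text.Length && text[pos] == c) { pos++; return true; }
        return false;
    }

    // expression := term (('+' | '-') term)*
    double Expression()
    {
        double value = Term();
        while (true)
        {
            if (Match('+')) value += Term();
            else if (Match('-')) value -= Term();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    double Term()
    {
        double value = Factor();
        while (true)
        {
            if (Match('*')) value *= Factor();
            else if (Match('/')) value /= Factor();
            else return value;
        }
    }

    // factor := number | variable | '(' expression ')'
    double Factor()
    {
        if (Match('('))
        {
            double inner = Expression();
            if (!Match(')')) throw new FormatException("Expected ')'");
            return inner;
        }

        int start = pos;
        while (pos < text.Length &&
               (char.IsLetterOrDigit(text[pos]) || text[pos] == '.' || text[pos] == '_')) pos++;
        string token = text.Substring(start, pos - start);
        if (token.Length == 0) throw new FormatException($"Unexpected input at {start}");

        if (char.IsDigit(token[0]))
            return double.Parse(token, CultureInfo.InvariantCulture);
        return variables[token];   // dotted path looked up as one flat key
    }

    double result = Expression();
    SkipSpaces();
    if (pos != text.Length) throw new FormatException($"Unexpected input at {pos}");
    return result;
}

var context = new Dictionary<string, double> { ["model.price"] = 10.0 };
Console.WriteLine(Evaluate("(model.price * 1.2)", context)); // 12
```

Because `Term` is called from `Expression`, multiplication and division bind more tightly than addition and subtraction, which is the precedence the grammar encodes.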

Function Evaluation:

using System.Linq;

// Built-in function: concat
public class ConcatFunction : IFunction
{
    public string Name => "concat";

    public object Evaluate(IExpression[] arguments, Context context)
    {
        // Evaluate each argument, then join the results into one string
        var values = arguments.Select(a => a.Evaluate(context));
        return string.Concat(values);
    }
}

// Usage in template:
{{concat(user.firstName, ' ', user.lastName)}}

Layout Engine Architecture

Layout Engine Selection:


// Panel.cs - implements IPDFViewPortComponents, so it has a GetEngine() function

public IPDFLayoutEngine GetEngine(Component component)
{
    if(component is PageBase)
        return new LayoutEnginePage();
    else if(component is TableGrid)
        return new LayoutEngineTable();
    else if(component is ListOrdered || component is ListUnordered)
        return new LayoutEngineList();
    else if(component is Panel)
        return new LayoutEnginePanel();
    else if(component is Label || component is TextLiteral)
        return new LayoutEngineText();
    // ... more types

    // Illustrative fallback so that every code path returns or throws
    throw new NotSupportedException(
        "No layout engine for " + component.GetType().Name);
}

Layout Algorithm (simplified):

NOTE: Most layout engines inherit from Scryber.PDF.Layout.LayoutEngineBase (or the higher-level LayoutEnginePanel) and override only the methods they need to change. Implementing a complete layout engine from scratch is an arduous task.

// LayoutEnginePanel.cs
public override void DoLayoutComponent()
{
    Panel panel = (Panel)this.Component;
    PDFLayoutContext context = this.Context;

    PDFRect availableSpace = context.Space;

    // Create block for this panel
    PDFLayoutBlock block = new PDFLayoutBlock(this.FullStyle);
    block.Position = availableSpace.Location;

    // Move to a new region if the block cannot fit in the current container
    var container = context.CurrentPage.LastAvailableBlock();

    if(container.AvailableSpace < block.RequiredSize)
    {
        if(!this.BeginNewRegion()) return;
    }

    // Running vertical offset of the next child within the panel
    PDFUnit currentY = 0;

    foreach(var child in panel.Children)
    {
        // Get layout engine for child
        var childEngine = GetEngine(child);

        // Calculate available space for child
        PDFRect childSpace = new PDFRect(
            x: availableSpace.X + child.Margins.Left,
            y: currentY,
            width: availableSpace.Width - child.Margins.Horizontal,
            height: availableSpace.Height - currentY
        );

        // Layout child
        context.Space = childSpace;
        childEngine.Layout(context, child);

        // Get child's layout block
        PDFLayoutBlock childBlock = context.DocumentLayout.CurrentPage
                                          .LastBlock;

        // Move Y position down
        currentY += childBlock.Height + child.Margins.Bottom;

        // Check for page break
        if(currentY > availableSpace.Height)
        {
            // Create new page
            context.DocumentLayout.AddPage();
            currentY = 0;
        }

        // Add child block to panel block
        block.Add(childBlock);
    }

    block.Height = currentY;
    context.DocumentLayout.CurrentPage.Add(block);
}

Text Line Breaking:

// LayoutEngineText.cs (simplified)
public void LayoutTextLine(string text, PDFUnit availableWidth,
                          PDFFont font, PDFUnit fontSize)
{
    List<string> words = SplitIntoWords(text);
    PDFLayoutLine currentLine = new PDFLayoutLine();
    PDFUnit currentWidth = 0;

    foreach(string word in words)
    {
        PDFUnit wordWidth = MeasureWord(word, font, fontSize);

        if(currentWidth + wordWidth <= availableWidth)
        {
            // Word fits on current line
            currentLine.Add(new PDFTextRun(word));
            currentWidth += wordWidth;
        }
        else
        {
            // Word doesn't fit - check hyphenation
            if(EnableHyphenation && wordWidth > availableWidth * 0.5)
            {
                var parts = HyphenateWord(word);
                currentLine.Add(new PDFTextRun(parts.First + "-"));
                FinishLine(currentLine);

                // Continue with remaining part
                currentLine = new PDFLayoutLine();
                currentLine.Add(new PDFTextRun(parts.Second));
                currentWidth = MeasureWord(parts.Second, font, fontSize);
            }
            else
            {
                // Break line and continue
                FinishLine(currentLine);
                currentLine = new PDFLayoutLine();
                currentLine.Add(new PDFTextRun(word));
                currentWidth = wordWidth;
            }
        }
    }

    FinishLine(currentLine);
}
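
The greedy strategy in the listing above can be tried in isolation. The following is a minimal runnable sketch that swaps font measurement for a stand-in width model where every character, including the joining space, counts as one unit. It omits hyphenation, and `BreakLines` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Greedy line breaking with a stand-in width function (1 unit per character).
// A real engine would measure each word with font metrics instead.
static List<string> BreakLines(string text, int maxWidth)
{
    var lines = new List<string>();
    string current = "";

    foreach (string word in text.Split(' ', StringSplitOptions.RemoveEmptyEntries))
    {
        if (current.Length == 0)
        {
            current = word;                       // first word always starts the line
        }
        else if (current.Length + 1 + word.Length <= maxWidth)
        {
            current += " " + word;                // word plus its joining space fits
        }
        else
        {
            lines.Add(current);                   // finish this line, start the next
            current = word;
        }
    }

    if (current.Length > 0) lines.Add(current);
    return lines;
}

foreach (string line in BreakLines("The quick brown fox jumps over the lazy dog", 15))
    Console.WriteLine(line);
// The quick brown
// fox jumps over
// the lazy dog
```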

Resource Management

Resource Lifecycle:

1. Request Resource
   Component needs font/image during Load or Layout

2. Check Cache
   IResourceContainer.TryGetResource(key)

3. If Not Cached:
   ├─→ Load resource (file, HTTP, embedded)
   ├─→ Create resource object (PDFFontResource, PDFImageXObject)
   ├─→ Register with document: IResourceContainer.AddResource(key, resource)
   └─→ Return resource

4. If Cached:
   └─→ Return cached resource

5. During Render:
   ├─→ Resource registers itself in page resources dictionary
   └─→ PDF writer outputs resource object once

6. References:
   Multiple components reference same resource by name
   (e.g., /F1 for font, /Im1 for image)
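
Steps 2 to 4 amount to a get-or-load cache. The sketch below illustrates the idea with a plain dictionary. `GetResource` and the `loads` counter are hypothetical and stand in for the real IResourceContainer calls.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical get-or-load cache keyed by resource name. The loader delegate
// runs only on a cache miss, so each resource is created once.
var cache = new Dictionary<string, object>();
int loads = 0;

object GetResource(string key, Func<object> load)
{
    if (cache.TryGetValue(key, out var cached))
        return cached;              // step 4: return the cached resource

    object resource = load();       // step 3: load and create the resource
    cache[key] = resource;          // register it for later requests
    return resource;
}

object a = GetResource("Helvetica-12", () => { loads++; return new object(); });
object b = GetResource("Helvetica-12", () => { loads++; return new object(); });

Console.WriteLine(ReferenceEquals(a, b)); // True
Console.WriteLine(loads);                 // 1
```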

Example: Font Resource:

Component A needs Helvetica 12pt
    │
    └─→ FontFactory.GetFont("Helvetica", 12)
            │
            ├─→ Check cache: "Helvetica-12"
            │       Not found
            │
            ├─→ Load Helvetica font definition
            ├─→ Create PDFFontResource
            ├─→ Cache: fonts["Helvetica-12"] = resource
            └─→ Return resource

Component B needs Helvetica 12pt
    │
    └─→ FontFactory.GetFont("Helvetica", 12)
            │
            └─→ Check cache: "Helvetica-12"
                    Found → Return cached resource

During Render:
    Page 1 renders Component A
        └─→ References font as /F1

    Page 2 renders Component B
        └─→ References same font as /F1

    Font written to PDF once:
        5 0 obj
          << /Type /Font
             /Subtype /TrueType
             /BaseFont /Helvetica
             ... >>
        endobj

Data Flow

Complete Example: HTML to PDF

Input HTML:

<!DOCTYPE html>
<html>
<head>
    <style>
        .header {
            font-size: 24pt;
            color: #336699;
            margin-bottom: 20pt;
        }
        .item {
            margin: 10pt;
            padding: 5pt;
            border: 1pt solid black;
        }
    </style>
</head>
<body>
    <div class="header">{{model.title}}</div>
    <div>
        <template data-bind="{{model.items}}">
            <div class="item">{{.name}}: ${{.price}}</div>
        </template>
    </div>
</body>
</html>

Data Model:

var model = new {
    title = "Product List",
    items = new[] {
        new { name = "Widget", price = 10.00 },
        new { name = "Gadget", price = 25.50 }
    }
};

Data Flow:

1. Parse (HTML β†’ Components):

HTMLParser.Parse(html)
    │
    ├─→ HTMLBody
    │     │
    │     ├─→ HTMLDiv (class="header")
    │     │     └─→ TextLiteral("{{model.title}}")
    │     │
    │     └─→ HTMLDiv
    │           └─→ HTMLTemplate (data-bind="{{model.items}}")
    │                 └─→ HTMLDiv (class="item")
    │                       └─→ TextLiteral("{{.name}}: ${{.price}}")

2. Init (Register components):

Document.Init(context)
    └─→ Each component initializes
        (No visual change, just setup)

3. Load (External resources):

Document.Load(context)
    │
    └─→ StylesDocument.Load()
            │
            ├─→ CSSStyleParser.Parse(<style> content)
            │       │
            │       ├─→ Parse .header { font-size: 24pt; ... }
            │       │     └─→ Create Style with specificity 10
            │       │
            │       └─→ Parse .item { margin: 10pt; ... }
            │             └─→ Create Style with specificity 10
            │
            └─→ Register styles with document

4. DataBind (Evaluate expressions):

Document.DataBind(context)
    │
    ├─→ HTMLDiv (class="header")
    │     │
    │     └─→ TextLiteral
    │           │
    │           ├─→ Parse "{{model.title}}"
    │           ├─→ Evaluate with context (model.title = "Product List")
    │           └─→ Set text = "Product List"
    │
    └─→ HTMLTemplate (data-bind="{{model.items}}")
            │
            ├─→ Parse "{{model.items}}"
            ├─→ Evaluate → returns array with 2 items
            │
            └─→ For each item:
                    │
                    ├─→ Clone template content (HTMLDiv with TextLiteral)
                    ├─→ Push item to data stack
                    │
                    ├─→ DataBind clone:
                    │     │
                    │     └─→ Parse "{{.name}}: ${{.price}}"
                    │           │
                    │           ├─→ .name evaluates to "Widget" (1st) / "Gadget" (2nd)
                    │           ├─→ .price evaluates to 10.00 / 25.50
                    │           └─→ Result: "Widget: $10.00" / "Gadget: $25.50"
                    │
                    └─→ Add clone to parent

Result Component Tree:
    HTMLBody
      ├─→ HTMLDiv (class="header")
      │     └─→ TextLiteral("Product List")
      │
      └─→ HTMLDiv
            ├─→ HTMLDiv (class="item")
            │     └─→ TextLiteral("Widget: $10.00")
            │
            └─→ HTMLDiv (class="item")
                  └─→ TextLiteral("Gadget: $25.50")
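
The repeat-and-bind step of the template can be mimicked on plain strings. This is a minimal sketch that expands `{{.field}}` placeholders once per item. `BindTemplate` is hypothetical, and values are supplied pre-formatted as strings; Scryber itself clones components and evaluates parsed expressions, not raw text.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for template binding: produces one bound string per
// item by replacing each "{{.field}}" with the item's value for that field.
static List<string> BindTemplate(string template,
                                 IEnumerable<IReadOnlyDictionary<string, string>> items)
{
    var results = new List<string>();

    foreach (var item in items)
    {
        // The current item plays the role of the top of the data stack
        string bound = Regex.Replace(template, @"\{\{\.(\w+)\}\}",
                                     m => item[m.Groups[1].Value]);
        results.Add(bound);
    }

    return results;
}

var items = new[]
{
    new Dictionary<string, string> { ["name"] = "Widget", ["price"] = "10.00" },
    new Dictionary<string, string> { ["name"] = "Gadget", ["price"] = "25.50" }
};

foreach (string line in BindTemplate("{{.name}}: ${{.price}}", items))
    Console.WriteLine(line);
// Widget: $10.00
// Gadget: $25.50
```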

5. Style Resolution:

For HTMLDiv (class="header"):
    │
    ├─→ Match selectors:
    │     └─→ .header (specificity: 10)
    │
    ├─→ Apply styles:
    │     ├─→ font-size: 24pt
    │     ├─→ color: #336699
    │     └─→ margin-bottom: 20pt
    │
    └─→ Store computed style

For each HTMLDiv (class="item"):
    │
    ├─→ Match selectors:
    │     └─→ .item (specificity: 10)
    │
    ├─→ Apply styles:
    │     ├─→ margin: 10pt
    │     ├─→ padding: 5pt
    │     └─→ border: 1pt solid black
    │
    └─→ Store computed style

6. Layout:

LayoutEngineDocument.Layout()
    │
    └─→ LayoutEnginePage.Layout()
            │
            └─→ LayoutEnginePanel.Layout(HTMLBody)
                    │
                    ├─→ Layout HTMLDiv.header:
                    │     │
                    │     ├─→ LayoutEngineText("Product List", 24pt)
                    │     │     │
                    │     │     ├─→ Measure: width=150pt, height=24pt
                    │     │     └─→ Create PDFLayoutLine
                    │     │
                    │     └─→ Add margin-bottom: 20pt
                    │           Total height: 44pt
                    │
                    ├─→ Layout HTMLDiv (container):
                    │     │
                    │     ├─→ Layout HTMLDiv.item (1):
                    │     │     │
                    │     │     ├─→ Add margin: 10pt (each side)
                    │     │     ├─→ Add padding: 5pt (each side)
                    │     │     ├─→ LayoutEngineText("Widget: $10.00")
                    │     │     │     └─→ Measure: 80pt x 12pt
                    │     │     ├─→ Add border: 1pt (each side)
                    │     │     └─→ Total: 112pt x 44pt (content plus 2 x 16pt in each direction)
                    │     │
                    │     └─→ Layout HTMLDiv.item (2):
                    │           └─→ (same process)
                    │                 Total: 112pt x 44pt
                    │
                    └─→ Create PDFLayoutDocument:
                            │
                            └─→ Page 1:
                                  ├─→ Block (y=0, h=44pt): "Product List"
                                  ├─→ Block (y=44pt, h=44pt): "Widget: $10.00"
                                  └─→ Block (y=88pt, h=44pt): "Gadget: $25.50"

7. Render (Layout β†’ PDF):

PDFLayoutDocument.OutputToPDF(writer)
    │
    ├─→ Write page object
    │
    ├─→ Write content stream:
    │     │
    │     ├─→ Block 1 (header):
    │     │     │
    │     │     ├─→ Set color: 0.2 0.4 0.6 rg
    │     │     ├─→ Set font: /F1 24 Tf
    │     │     ├─→ Position: 0 750 Td
    │     │     └─→ Draw text: (Product List) Tj
    │     │
    │     ├─→ Block 2 (item 1):
    │     │     │
    │     │     ├─→ Draw border:
    │     │     │     10 706 92 24 re
    │     │     │     S
    │     │     │
    │     │     ├─→ Set font: /F1 12 Tf
    │     │     ├─→ Position: 15 711 Td
    │     │     └─→ Draw text: (Widget: $10.00) Tj
    │     │
    │     └─→ Block 3 (item 2):
    │           └─→ (similar)
    │
    └─→ Write font resources:
          /Font << /F1 5 0 R >>

Result PDF:
    Page with formatted content
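
Two of the numbers in the content stream above can be derived from the CSS. The sketch below converts `#336699` to the operands of the `rg` operator, and computes the 92 x 24 border rectangle from the 80pt x 12pt text, 5pt padding and 1pt border. `ToRgOperator` and `BorderBox` are hypothetical helpers written for this illustration.

```csharp
using System;
using System.Globalization;

// Converts "#RRGGBB" into the three 0..1 operands of the PDF "rg" operator.
static string ToRgOperator(string hex)
{
    int r = Convert.ToInt32(hex.Substring(1, 2), 16);
    int g = Convert.ToInt32(hex.Substring(3, 2), 16);
    int b = Convert.ToInt32(hex.Substring(5, 2), 16);

    return string.Format(CultureInfo.InvariantCulture,
        "{0:0.###} {1:0.###} {2:0.###} rg", r / 255.0, g / 255.0, b / 255.0);
}

// Border box = content size plus padding and border on both sides.
static (double Width, double Height) BorderBox(double contentWidth, double contentHeight,
                                               double padding, double border)
{
    double edges = 2 * (padding + border);
    return (contentWidth + edges, contentHeight + edges);
}

Console.WriteLine(ToRgOperator("#336699"));       // 0.2 0.4 0.6 rg

var box = BorderBox(80, 12, 5, 1);
Console.WriteLine($"{box.Width} {box.Height}");   // 92 24
```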

Design Patterns

1. Component Pattern

Purpose: Uniform treatment of individual and composite components (the Composite pattern)

Structure:

  • IComponent: Common interface
  • Leaf components: Label, Image, Shape
  • Composite components: Panel, Page, Table

Benefits:

  • Recursive operations (Init, Load, DataBind)
  • Uniform lifecycle management
  • Easy to add new component types

2. Factory Pattern

Purpose: Decouple component creation from usage

Examples:

  • HTMLParserComponentFactory: HTML tag β†’ Component
  • ImageFactoryList: Image format β†’ Image handler
  • FontFactory: Font family β†’ Font resource

Benefits:

  • Centralized creation logic
  • Easy to extend with new types
  • Configuration-driven instantiation

3. Strategy Pattern

Purpose: Select algorithm at runtime

Examples:

  • IPDFLayoutEngine: Different layout strategies per component type
  • CSSStyleAttributeParser<T>: Different parsing strategies per CSS property
  • ImageFactoryBase: Different decoding strategies per format

Benefits:

  • Algorithm encapsulation
  • Easy to add new strategies
  • Runtime selection based on component type
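
Runtime strategy selection can be sketched with a lookup table of delegates, one per CSS property. The table contents and the `ParseProperty` function are hypothetical. Scryber uses parser classes, not lambdas, but the selection principle is the same.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical strategy table: each CSS property name maps to its own
// parsing function, and the right one is chosen at runtime by name.
var strategies = new Dictionary<string, Func<string, object>>
{
    ["font-size"] = value => double.Parse(value.Replace("pt", ""), CultureInfo.InvariantCulture),
    ["font-family"] = value => value.Trim('"', '\'')
};

object ParseProperty(string name, string value)
{
    return strategies.TryGetValue(name, out var parse)
        ? parse(value)
        : throw new NotSupportedException($"No parser registered for '{name}'");
}

Console.WriteLine(ParseProperty("font-size", "24pt"));          // 24
Console.WriteLine(ParseProperty("font-family", "'Helvetica'")); // Helvetica
```

Adding support for a new property means adding one entry to the table; the calling code does not change.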

4. Visitor Pattern (Context Objects)

Purpose: Separate operations from object structure

Examples:

  • InitContext, LoadContext, DataContext passed through tree
  • Operations (Init, Load, DataBind) implemented as methods
  • Context accumulates state without modifying components

Benefits:

  • Components remain stateless
  • Easy to add new operations
  • Clear separation of concerns

5. Template Method Pattern

Purpose: Define algorithm skeleton, defer steps to subclasses

Examples:

  • LayoutEngineBase.Layout(): Common setup, subclasses implement specifics
  • CSSStyleAttributeParser.DoSetStyleValue(): Framework calls, subclass implements

Benefits:

  • Code reuse through inheritance
  • Enforces consistent workflow
  • Extension points for customization

6. Flyweight Pattern

Purpose: Share common data to reduce memory

Examples:

  • ISharedResource: Fonts and images cached and shared
  • Style objects marked immutable after calculation
  • Font metrics shared across all uses

Benefits:

  • Reduced memory footprint
  • Smaller PDF file size
  • Faster resource lookup

7. Builder Pattern

Purpose: Construct complex objects step by step

Examples:

  • PDFWriter: Builds PDF structure incrementally
  • Document construction through parsing
  • Layout item construction during layout phase

Benefits:

  • Stepwise construction
  • Immutable result
  • Clear construction process

Extension Architecture

These guides need updating and separating out. Feel free to read, but the examples and descriptions should not be relied on.

There are other examples in the Configuration Guide that are more complete.

Adding Custom Components

1. Define Component Class:

public class CustomBanner : Panel
{
    public string BannerText { get; set; }
    public PDFColor BannerColor { get; set; } = PDFColors.Blue;

    protected override void DoDataBind(DataContext context)
    {
        base.DoDataBind(context);

        // Add custom data binding logic
        if(!string.IsNullOrEmpty(BannerText))
        {
            var label = new Label();
            label.Text = BannerText;
            label.ForeColor = BannerColor;
            this.Contents.Add(label);
        }
    }
}

2. Register with Factory (for HTML parsing):

public class CustomComponentFactory : HTMLParserComponentFactory
{
    public CustomComponentFactory()
    {
        // Register custom tag
        this.RegisterTag("banner", typeof(CustomBanner));
    }
}

// Use custom factory
var parser = new HTMLParser(new CustomComponentFactory());
var doc = parser.Parse(html);

3. Use in HTML:

<banner banner-text="Welcome!" banner-color="#FF0000" />

Adding Custom CSS Properties

1. Define Style Property:

public class CustomStyle : StyleBase
{
    public string CustomProperty { get; set; }
}

2. Create Parser:

public class CSSCustomParser : CSSStyleAttributeParser<CustomStyle>
{
    public CSSCustomParser()
    {
        // Register CSS property name
        this.RegisterProperty("custom-property");
    }

    protected override bool DoSetStyleValue(Style style,
                                           CSSStyleItemReader reader,
                                           PDFContextBase context)
    {
        string value = reader.CurrentTextValue;

        if(reader.CurrentAttribute == "custom-property")
        {
            style.Custom.CustomProperty = value;
            return true;
        }

        return false;
    }
}

3. Register Parser:

CSSStyleParser.RegisterParser(new CSSCustomParser());

4. Use in CSS:

.my-class {
    custom-property: "my value";
}

Adding Custom Expression Functions

1. Implement Function:

public class UpperFunction : IFunction
{
    public string Name => "upper";

    public object Evaluate(IExpression[] arguments, Context context)
    {
        if(arguments.Length != 1)
            throw new ArgumentException("upper() requires 1 argument");

        var value = arguments[0].Evaluate(context);
        return value?.ToString().ToUpper();
    }
}

2. Register Function:

var context = new Context();
context.RegisterFunction(new UpperFunction());

3. Use in Template:

<div>{{upper(model.name)}}</div>

Adding Custom Layout Engine

1. Implement Engine:

public class CustomLayoutEngine : LayoutEngineBase
{
    public override void Layout(PDFLayoutContext context,
                                Component component)
    {
        CustomComponent custom = (CustomComponent)component;

        // Measure component
        PDFSize size = MeasureComponent(custom, context);

        // Create layout block
        PDFLayoutBlock block = context.DocumentLayout
                                     .CurrentPage
                                     .CreateBlock();
        block.Position = context.Space.Location;
        block.Size = size;

        // Layout children if any
        foreach(var child in custom.Contents)
        {
            LayoutChild(child, context);
        }

        // Register block
        context.DocumentLayout.CurrentPage.Add(block);
    }
}

2. Register Engine:

public class CustomComponent : VisualComponent
{
    public override IPDFLayoutEngine GetEngine()
    {
        return new CustomLayoutEngine();
    }
}

Performance Considerations

Memory Management

1. Resource Sharing:

  • Fonts and images cached at document level
  • Multiple references to same resource
  • Reduces memory footprint and PDF size

2. Lazy Loading:

  • External resources loaded only when needed
  • Images not decoded until layout phase
  • CSS parsed on demand

3. Layout Item Pooling:

  • Consider pooling for high-volume scenarios
  • Current implementation creates new objects
  • Potential optimization for large documents

Layout Performance

1. Two-Stage Rendering:

  • Layout phase separate from render phase
  • Allows optimization and pre-calculation
  • Can abort before rendering if layout fails

2. Incremental Layout:

  • Components laid out as encountered
  • Page breaks handled during layout
  • No need to layout entire document at once

3. Font Metrics Caching:

  • Character widths cached per font
  • Reduces measurement overhead
  • Critical for text-heavy documents
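
A per-font width cache can be as simple as a dictionary keyed by character. In this sketch `MeasureGlyph` is a made-up stand-in for a real font metrics lookup, and the `measurements` counter shows how many lookups the cache leaves.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical width cache for one font: each distinct character is measured
// once, and later occurrences reuse the cached width.
var widthCache = new Dictionary<char, double>();
int measurements = 0;

double MeasureGlyph(char c)
{
    measurements++;
    return c == ' ' ? 0.25 : 0.5;   // made-up widths in em units
}

double MeasureText(string text, double fontSize)
{
    double total = 0;

    foreach (char c in text)
    {
        if (!widthCache.TryGetValue(c, out double width))
        {
            width = MeasureGlyph(c);    // cache miss: measure and remember
            widthCache[c] = width;
        }
        total += width;
    }

    return total * fontSize;
}

Console.WriteLine(MeasureText("aaa a", 12)); // 27
Console.WriteLine(measurements);             // 2
```

Five characters are measured with only two glyph lookups, one for `a` and one for the space.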

Async Operations

1. WASM Compatibility:

  • All I/O operations async
  • No blocking calls
  • Uses async/await throughout

2. Parallel Resource Loading:

  • Images and fonts can load in parallel
  • HttpClient for remote resources
  • Reduces total load time

3. Timer Execution:

  • DocumentTimerExecution yields periodically
  • Keeps UI responsive during generation
  • Essential for large documents in WASM

PDF Output Optimization

1. Stream Compression:

  • Content streams can be compressed
  • Uses FlateDecode filter
  • Reduces file size by 50-70%
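
FlateDecode streams carry zlib-format data, which .NET can produce with `System.IO.Compression.ZLibStream` (available from .NET 6, so not on the netstandard2.0 target). The sketch below compresses a repetitive content stream and checks the round trip. `Deflate` and `Inflate` are hypothetical helper names, and the result says nothing about the ratio on real documents.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

// Compresses bytes into zlib format, as expected by the FlateDecode filter.
static byte[] Deflate(byte[] data)
{
    using var output = new MemoryStream();
    using (var zlib = new ZLibStream(output, CompressionLevel.Optimal, leaveOpen: true))
    {
        zlib.Write(data, 0, data.Length);
    }   // disposing the ZLibStream writes the final compressed block
    return output.ToArray();
}

// Reverses Deflate, used here only to verify the round trip.
static byte[] Inflate(byte[] data)
{
    using var input = new MemoryStream(data);
    using var zlib = new ZLibStream(input, CompressionMode.Decompress);
    using var output = new MemoryStream();
    zlib.CopyTo(output);
    return output.ToArray();
}

// A repetitive content stream, similar to many text runs on one page
string content = string.Concat(
    Enumerable.Repeat("BT /F1 12 Tf 15 711 Td (Widget: $10.00) Tj ET\n", 50));
byte[] raw = Encoding.ASCII.GetBytes(content);

byte[] compressed = Deflate(raw);
byte[] restored = Inflate(compressed);

Console.WriteLine($"{raw.Length} bytes -> {compressed.Length} bytes");
Console.WriteLine(restored.SequenceEqual(raw)); // True
```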

2. Resource Deduplication:

  • Identical resources referenced once
  • Image hash checking
  • Font embedding optimized

3. Incremental Writing:

  • Objects written as created
  • No need to buffer entire PDF
  • Supports streaming to output

Profiling and Diagnostics

1. Trace Logging:

  • Performance logging available
  • Track time per pipeline stage
  • Identify bottlenecks

2. Memory Profiling:

  • Monitor resource cache size
  • Track layout item count
  • Detect memory leaks

3. Layout Diagnostics:

  • Can dump layout tree
  • Visualize box model
  • Debug positioning issues

Conclusion

Scryber.Core's architecture enables sophisticated PDF generation through:

  1. Clean Separation: Specialized projects with clear responsibilities
  2. Extensibility: Multiple extension points for customization
  3. Performance: Optimized resource management and lazy loading
  4. Standards Compliance: CSS box model and HTML semantics
  5. Modern .NET: Multi-targeting, async/await, WASM compatible

The pipeline architecture (Parse → Init → Load → DataBind → Style → Layout → Render) provides natural breakpoints for debugging and extension, while the component model ensures consistent behavior across all element types.

Understanding this architecture enables effective development, debugging, and extension of the Scryber.Core PDF generation engine.