Blog Articles

Scaling and Slicing PDF Documents with pdfjam and mutool

I recently encountered the following challenge: I had a PDF document consisting of multiple A4 pages, each of which I needed to print scaled up to A2. However, I only had an A4 printer available, and the printer driver was not able to perform the scaling and/or slicing on its own.

After a long search and many disappointments (none of LibreOffice, Inkscape, or Evince could do what I wanted, or I simply couldn't find the feature), I came up with the following solution:

$ pdfjam -o scaled.pdf --a2paper input.pdf
$ mutool poster -x 2 -y 2 scaled.pdf sliced.pdf

The first command takes the input file, scales it up to A2 size, and writes it to an intermediate file. The second command slices each page of the intermediate file into 2 slices both vertically and horizontally, totaling 4 slices per page, which again results in A4-sized slices the printer can print.
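Since the two steps always go together, they can be wrapped in a small helper. The following Python sketch just builds the two command lines (the function names and default filenames are my own; it assumes pdfjam and mutool are installed):

```python
import subprocess

def poster_commands(src, scaled="scaled.pdf", sliced="sliced.pdf",
                    paper="a2paper", nx=2, ny=2):
    # Build the two command lines: scale up with pdfjam, then slice with mutool.
    return [
        ["pdfjam", "-o", scaled, f"--{paper}", src],
        ["mutool", "poster", "-x", str(nx), "-y", str(ny), scaled, sliced],
    ]

def scale_and_slice(src):
    # Run both steps in sequence.
    for cmd in poster_commands(src):
        subprocess.check_call(cmd)
```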

Automated Debian package building with Gitlab CI and Reprepro

I previously deployed some ad-hoc services using ugly and difficult-to-maintain solutions, such as binaries manually extracted from container images and copied into /usr/local/bin. Of course, this means manual work for each upgrade of the service, and if you upgrade often, a lot of repetitive manual work. Or, in other words, the perfect kind of work to automate.

So I decided to package the software in question for Debian, which is the Linux distribution I run these services on, and to automate the process of building the packages and publishing them to a repository. There are three different "categories" of software I wanted to have in my repository:

  • Binaries published as build artifacts in releases of their upstream repository. An example of this is Gitea. The binaries can be automatically downloaded and put into a package.

  • Software released as container images only. An example of this is the Drone CI Server. This is a bit more tricky, as the binaries must be extracted from the image before they can be put into a package.

  • Software already released as Debian packages, but not available through a repository. An example of this is a project of my own, the iCalendar Timeseries Server, whose CI pipeline automatically builds Debian packages and adds them as build artifacts to releases. Here, the already-built package just needs to be fetched and added to the repository.
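For the first category, packaging essentially means placing the downloaded binary in a staging tree and adding a control file for dpkg-deb. Here's a minimal Python sketch of that staging step; the helper name and all control-file values are placeholders, and a real package needs considerably more metadata:

```python
from pathlib import Path
import shutil

def stage_deb(staging, name, version, binary):
    # Lay out a minimal staging tree for dpkg-deb: the binary under /usr/bin
    # and a DEBIAN/control file with the bare-minimum fields.
    staging = Path(staging)
    bindir = staging / "usr" / "bin"
    bindir.mkdir(parents=True, exist_ok=True)
    shutil.copy(binary, bindir / name)
    debian = staging / "DEBIAN"
    debian.mkdir(exist_ok=True)
    (debian / "control").write_text(
        f"Package: {name}\n"
        f"Version: {version}\n"
        "Architecture: amd64\n"
        "Maintainer: Jane Doe <jane@example.com>\n"
        f"Description: {name} (repackaged upstream release binary)\n"
    )
    return staging

# Afterwards: dpkg-deb --build <staging> <name>_<version>_amd64.deb
```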

For creating and maintaining the repository, I'm using Reprepro, as it is extremely simple to use and generally uncomplicated and lightweight. It does, however, come with some limitations; in particular, Reprepro only keeps the latest version of each package in the repository index, so older versions won't be available to clients.

For automating the packaging and repository build process, I'm using Gitlab CI. The pipeline runs each night, builds the latest stable version of each package, adds the packages to the in-container Reprepro repository, and then synchronizes the repository to a web server of mine.

The source repository with the package build scripts and pipeline configuration is available on Gitlab. However, before using it as a template, be advised that the packages built by this pipeline are not exceptionally high-quality. They are also usually built to fit my personal needs, so don't expect ready-to-use packages.

Recording IPTV Using ffmpeg

I don't usually watch TV. But from time to time there is something interesting on the programme, such as debates on local politics. Unfortunately, those usually run at a time of day when I'm not able (or, more likely, not willing) to tune in and pay attention to an hour of political discourse. So I want to record them and watch them later instead.

My ISP provides its IPTV programme as MPEG-TS streams via multicast UDP. They even link an M3U playlist of all stations on their website, so you can basically watch TV with any client whatsoever, as long as it speaks IGMP and understands MPEG-TS video streams. This also makes recording very easy, as such streams are supported by a lot of multimedia processing software, including ffmpeg.

The playlist consists of a list of TV stations, each of which is represented by its own multicast group and a UDP port. So let's just take the first station and see what ffmpeg finds in there:

$ ffmpeg -i udp://239.77.0.77:5000

Input #0, mpegts, from 'udp://239.77.0.77:5000':
  Duration: N/A, start: 41892.675600, bitrate: N/A
  Program 9038 
    Metadata:
      service_name    : SRF 1 HD
      service_provider: Schweizer Radio und Fernsehen
    Stream #0:0[0x50]: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, progressive), 1280x720 [SAR 1:1 DAR 16:9], 50 fps, 50 tbr, 90k tbn, 100 tbc
    Stream #0:1[0x51](deu): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 192 kb/s (clean effects)
    Stream #0:2[0x52](eng): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 192 kb/s (clean effects)
    Stream #0:3[0x5b](deu): Audio: ac3 ([6][0][0][0] / 0x0006), 48000 Hz, 5.1(side), fltp, 448 kb/s (clean effects)
    Stream #0:4[0x6e](deu,deu): Subtitle: dvb_teletext ([6][0][0][0] / 0x0006)
    Stream #0:5[0x70]: Unknown: none ([5][0][0][0] / 0x0005)
    Stream #0:6[0x72]: Unknown: none ([12][0][0][0] / 0x000C)

We can see that the MPEG-TS stream contains multiple individual streams, which are listed in the output above. Now, I don't know what's up with 0:5 and 0:6, or why ffmpeg doesn't understand them. Anyway, I only need the video and one audio channel. Let's just pick the first two, and record a one-hour TV show:

ffmpeg -f mpegts -i udp://239.77.0.77:5000 -map 0:0 -map 0:1 -c copy -t 3600 recording.mkv

To break it down:

  • -f mpegts tells ffmpeg that the input is an MPEG transport stream.
  • -i udp://239.77.0.77:5000 tells ffmpeg to join the specified multicast group and receive the MPEG-TS stream on UDP port 5000.
  • -map 0:0 -map 0:1 only extracts the streams 0:0 (the H.264 video stream) and 0:1 (the German audio channel in MP2).
  • -c copy causes the input stream to be demuxed only, and the selected streams to be written to the output without CPU-intensive decoding and reencoding.
  • -t 3600 terminates the stream after one hour, when the show is over.
  • recording.mkv is the output filename. The container format (here MKV) is deduced from the filename.

So the whole ffmpeg command takes the original stream as input, demultiplexes it into the individual media streams, discards all but one video and one audio stream, and multiplexes those into a Matroska container, which is then written to disk.
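If you record regularly, it can be handy to assemble that command line programmatically and hand it to a scheduler. A small Python sketch (the function name and defaults are my own):

```python
import subprocess

def record_cmd(url, duration_s, out, maps=("0:0", "0:1")):
    # Assemble the ffmpeg stream-copy recording command described above.
    cmd = ["ffmpeg", "-f", "mpegts", "-i", url]
    for m in maps:
        cmd += ["-map", m]
    cmd += ["-c", "copy", "-t", str(duration_s), out]
    return cmd

# To actually record:
# subprocess.run(record_cmd("udp://239.77.0.77:5000", 3600, "recording.mkv"))
```

Scheduling the call via cron or at then takes care of unattended recordings.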

Extracting 3D Models From CesiumJS - Part 1: Terrain Map Scraping

This article is part of a series:
  1. Extracting 3D Models From CesiumJS - Part 1: Terrain Map Scraping

CesiumJS is an open-source JavaScript framework for rendering 2D and 3D maps - everything from a local area to whole planets - in a web browser using WebGL. In the past few weeks I've been working on obtaining 3D model data in a situation where the only easily available way of accessing the data is through a CesiumJS-based viewer. As far as I know, Cesium deals with two different kinds of 3D data: on one side, there are 3D models used for small-scale objects like buildings; on the other side, there are terrain maps.

Addressing Terrain Tiles

To get started with the terrain map, I needed to figure out how to obtain terrain data for a certain geographical region. Luckily, this is fairly well documented. CesiumJS uses its own solution called quantized-mesh.

quantized-mesh supports different "zoom levels", for which the whole globe is divided into more and more individual "tiles". At level 0, there are only two tiles: the first tile covers the western hemisphere, the second tile covers the eastern hemisphere. With each increase in zoom level, each tile is split into 4, each new tile containing a quadrant of the previous tile. Each tile can then be identified by its zoom level, an x coordinate and a y coordinate. x starts at 0, representing -180° longitude, and increments eastward; at zoom level z there are 2^(z+1) tiles along the x axis, so the last tile, at x=2^(z+1)-1, ends at +180° longitude. y=0 starts at the south pole at -90° latitude and goes north; there are 2^z tiles along the y axis, so the last tile, at y=2^z-1, ends at +90° latitude. No +1 in the exponent here, since for full coverage, x needs to cover the full 360° of longitude, while y only needs to cover half as much, for a total of 180° of latitude.

Using these three variables, z, x and y, the quantized-mesh specification defines a URL template for addressing an individual tile via HTTP:

http://example.com/tiles/<z>/<x>/<y>.terrain

This can be a little hard to imagine, so I attempted to visualize the first two zoom levels in Figure 1.

Figure 1: Tiles at zoom levels 0 and 1: At level 0, there are 2 tiles, each covering a hemisphere. At level 1, there are 8 tiles in total; each tile from level 0 was divided into 4 tiles.

If you're familiar with OpenStreetMap, you may recognize this way of dividing the globe into tiles and addressing individual tiles. An OpenStreetMap tile URL looks like this: https://tile.openstreetmap.org/14/8537/5725.png. This similarity is not accidental; in fact, the quantized-mesh tiling schema was designed to follow the Tile Map Service standard's tiling schema.

All of the above assumes our data source uses a WGS84 projection and the TMS tiling schema. quantized-mesh supports other configurations as well, where there is only a single tile at level 0, or with the x and y coordinates swapped. You can find more information in the documentation.

Mapping Geographical Regions to Terrain Tiles

Now that we know how quantized-mesh tiles are addressed, let's find out which tiles we actually need. In my use case, I wanted to obtain all tiles in a bounding box defined by lower and upper latitudes and longitudes. Converting a tile's x and y indices to the coordinates of its south-western corner is quite easy:

lat = -90 + y * 180 / (2**z)
lon = -180 + x * 180 / (2**z)

So to get the ranges for x and y, we solve those equations for x and y and add proper rounding:

x_min = floor( lon_min * (2**z)/180 + 2**z     )
x_max = ceil(  lon_max * (2**z)/180 + 2**z     )
y_min = floor( lat_min * (2**z)/180 + 2**(z-1) )
y_max = ceil(  lat_max * (2**z)/180 + 2**(z-1) )

The resulting ranges, with x_max and y_max being exclusive upper bounds, are then formulated as

tiles_x = range(x_min, x_max)
tiles_y = range(y_min, y_max)
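In Python, the whole computation looks like this - a direct transcription of the formulas above, where floor/ceil make sure partially covered border tiles stay inside the ranges:

```python
from math import floor, ceil

def tile_ranges(z, lat_min, lat_max, lon_min, lon_max):
    # Tile index ranges covering the bounding box at zoom level z;
    # x_max and y_max are exclusive upper bounds.
    x_min = floor(lon_min * 2**z / 180 + 2**z)
    x_max = ceil(lon_max * 2**z / 180 + 2**z)
    y_min = floor(lat_min * 2**z / 180 + 2**(z - 1))
    y_max = ceil(lat_max * 2**z / 180 + 2**(z - 1))
    return range(x_min, x_max), range(y_min, y_max)
```

For the whole globe at level 0 this yields x in {0, 1} and y in {0}, matching the two-hemisphere layout from Figure 1.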

So once we know at which zoom level our tiles are available, we can compute which tiles to download in order to fully cover our region.

Scraping Terrain Tiles

Now all that's left to do is to figure out how to actually download the individual tiles. Your web browser's inspector can be a great help here. Open the Cesium-based application in your web browser, and open the inspector's network tab. As you move around in the map, you will most likely see a lot of requests, which can be grouped into the following categories:

  • Image tiles, addressed in the same way as the terrain tiles. I didn't need these for my use case, but you should be able to scrape them the same way I'm scraping the terrain tiles.
  • 3D models, with file extensions of b3dm, glb and cmpt. I'll take a look at some of those in later articles.
  • The terrain tiles, with a file extension of terrain. These are the ones we want to obtain.

As you zoom in and out of the map, you'll see that the zoom levels in the requests for image and terrain tiles increase or decrease accordingly. Since I wanted the most detailed tiles, I just continued with the maximum zoom level that was available in the application I was working with.

Now just right click on one of the terrain requests, and navigate to Copy / Copy as cURL. You'll get something like this:

curl 'https://maps.example.com/terrain/tiles/12/12345/12345.terrain' \
  -H 'Origin: https://maps.example.com' \
  -H 'Accept-Encoding: gzip, deflate, br' \
  -H 'User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; ...)' \
  -H 'Accept: application/vnd.quantized-mesh,application/octet-stream;q=0.9' \
  -H 'Referer: https://maps.example.com/' \
  --compressed

At least the Accept header is usually required, and the quantized-mesh specification recommends setting it, especially since it's used to tell the server which extensions to the quantized-mesh standard the client supports, if any. Some other headers, especially Origin, Referer and User-Agent, may be required as a "soft form of access control", depending on the server's configuration. I found it worked best to just keep the entire curl request as-is and only modify the z, x and y parameters in the URL.

Knowing the supported values for z and our ranges for x and y, we can now easily script the download of the individual files. A small hint regarding the filenames: use all three parameters, z, x and y, in the output filename; otherwise you'll end up overwriting the same files over and over again.
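Putting it all together, here's a sketch of the download loop in Python; the base URL and header values are stand-ins for whatever your copied curl command contains:

```python
import urllib.request
from pathlib import Path

BASE = "https://maps.example.com/terrain/tiles"  # hypothetical endpoint
HEADERS = {
    "Accept": "application/vnd.quantized-mesh,application/octet-stream;q=0.9",
    "Referer": "https://maps.example.com/",
}

def tile_jobs(z, xs, ys, out_dir="tiles"):
    # One (url, filename) pair per tile; all of z, x and y go into the name
    # so no two tiles overwrite each other.
    return [(f"{BASE}/{z}/{x}/{y}.terrain", f"{out_dir}/{z}_{x}_{y}.terrain")
            for x in xs for y in ys]

def download(url, path):
    # Fetch one tile with the headers copied from the browser.
    req = urllib.request.Request(url, headers=HEADERS)
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(req) as resp, open(path, "wb") as f:
        f.write(resp.read())
```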

Once the download is done, we are left with a bunch of binary .terrain files. The next article will cover how to parse them.

Mounting a 2.5" Drive Inside an APU2

At home, I'm using a PC Engines APU2 as a firewall and WiFi access point. Since this is the only device in my home constantly running and constantly connected to the internet, I decided to use it as an additional backup site for my servers.

However, for this to become viable, I needed to add around 200 GB of storage. I considered the following options:

  • Add more mSATA storage
  • Attach external (USB?) storage
  • Attach a 2.5" drive internally

I quickly discarded the second option, since external storage would take up more space, which could become a challenge especially since I wall-mounted the device. As for the other two options, I had a few spare 2.5" drives lying around, so I figured I'd try to use those first before buying new storage.

The Problem

Space inside the APU2 case is tight. There is some 9-10 mm of space between the highest parts of the system board (pin headers, capacitors, mPCIe and mSATA cards) and the case cover, at least across most of the board's footprint. Sure, I could just try to glue the drive to the case ceiling, but that would be both ugly and extremely cumbersome to handle. Also, I really wanted to prevent the disk from ever touching anything on the board: most disk casings are made from metal, and the things it could touch on the board are parts like pin headers.

The Solution

So I decided to come up with a 3D-printed mount. It consists of 3 parts: two side parts, which are put between the board and the case walls, and a center part, onto which the disk is screwed, held in place by the two side parts. Effectively, it forms a "bridge" over the system board.

The first attempt already turned out pretty well, but a few problems became apparent:

  • I took a wrong measurement at one point and had to move some cutouts so they would properly align with the pin headers they were meant for.
  • The plugs of SATA cables (at least the ones I had lying around, as well as the PC Engines satacab1) extended below the "base line" of the disk they are plugged into. This was solved by adding another hole into the 3D-printed part.
  • Due to the limited space, this design only works for drives 7 mm high. The much more common 9.5 mm drives won't fit, or at least put some stress on the printed part, and possibly the system board.

So I ended up using a 250GB SSD (the only 7mm drive I had lying around). However, when the PC Engines-specific SATA cable ("satacab1", required because there is no SATA power connector on the APU2 board) arrived, another problem popped up:

The cable is both quite short and rather rigid, so if you use it to connect the drive to the SATA connector on the APU2 board, you end up putting some stress on both the connector and the printed parts. This can be easily mitigated by putting a short SATA extension cable in between the satacab1 and the board's connector.

The Result

I've published the resulting design on Thingiverse. Alternatively, you can download the files directly from here. The design is licensed under the CC BY-SA 4.0 license.

The results of my print can be seen in the following photos. I made this print out of PLA using an Ultimaker 3.

APU2 in its case with the mount on top, but without a disk attached
APU2 in its case with a mounted SSD on top