Sunday, February 18, 2018

How to generate the Volpi book as a single file

Each page in the Volpi translation app is in a separate HTML file. I edit and update the individual files, and I don't maintain a single file with all of the book's pages.

However, you can generate a single HTML file with all the pages yourself, by pulling the individual pages from the app and concatenating them.

Here's how.

First create a start.html file with the HTML tags at the top:

<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Franco Volpi - Heidegger and Aristotle</title>
</head>
<body>
<h1>Franco Volpi - Heidegger and Aristotle</h1>
<h2>Translated by Pete Ferreira</h2>
<hr/><br/><br/>


Then create an end.html file with the HTML tags at the bottom:

</body>
</html>


Then use this PowerShell script to concatenate start.html, all the pages from the app, and end.html.

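# Start with the opening HTML tags.
$bookContent = Get-Content 'start.html'

# Fetch pages 1 through 117 and append each one.
For ($pagenumber = 1; $pagenumber -lt 118; $pagenumber++) {
    # Page URLs use three-digit numbers: 001, 002, ...
    $paddedpagenumber = ("{0:D3}" -f $pagenumber)
    $url = "http://beyng.com/volpi/assets/EN/Volpi.$paddedpagenumber.html"
    $resp = Invoke-WebRequest -URI $url

    # Centered page number ahead of the page content.
    $bookContent += "<br/><br/><p style=""text-align:center"">$pagenumber</p>`r`n"

    # Decode the raw response bytes as UTF-8.
    $bookContent += [System.Text.Encoding]::UTF8.GetString($resp.RawContentStream.ToArray())
}

# Finish with the closing HTML tags and write the book.
$bookContent += Get-Content 'end.html'
$bookContent | Out-File VolpiBook.html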

Before each page, the script inserts HTML with the page number. The padded page number is required for the page URLs - e.g. page 1 as 001. The script decodes the raw response bytes with UTF8.GetString to keep the Greek characters from getting munged.
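
To see those two details in isolation, here is a small sketch that runs without any network access. The sample word and the variable names are made up for illustration and are not taken from the app. The ISO-8859-1 decoding stands in for whatever wrong single-byte encoding might get applied to a response; that part is an assumption about the cause, not something confirmed here.

```powershell
# Zero-pad page numbers to three digits, as the page URLs expect.
$padded = 1, 42, 117 | ForEach-Object { "{0:D3}" -f $_ }
# $padded now holds 001, 042, 117

# Hypothetical sample text: the Greek word "logos", built from code points
# so the result does not depend on how this script file itself is saved.
$greek = -join [char[]](0x03BB, 0x03CC, 0x03B3, 0x03BF, 0x03C2)

# Each of these Greek letters takes two bytes in UTF-8.
$bytes = [System.Text.Encoding]::UTF8.GetBytes($greek)

# Decoding the bytes as UTF-8 gives back the original five characters.
$decoded = [System.Text.Encoding]::UTF8.GetString($bytes)

# Decoding the same bytes as ISO-8859-1 (assumed wrong encoding) turns
# every byte into its own character: ten characters of gibberish.
$munged = [System.Text.Encoding]::GetEncoding('iso-8859-1').GetString($bytes)

"Bytes: $($bytes.Length), decoded: $($decoded.Length) chars, munged: $($munged.Length) chars"
```

If the last line reports more munged characters than decoded ones, the bytes were split apart the way the Greek text would be in the book.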