DJVU is a PDF-like file format, yes, but this file doesn't appear to actually be in that format. it looks like a DJVU file was converted to XML as the "text" format at the same time it was converted to a PDF (and the PDF download on that page works fine, even though it takes forever). you can see the original DJVU file referenced in the XML source: file://localhost//var/tmp/autoclean/derive/blackbeautyautob00sewe_0//blackbeautyautob00sewe_0.djvu
the thing is, it's not clear at all why this was converted to XML like this. there are huge sections that contain literally no information, and the sections that do have the text of the document contain zero formatting information, because this XML only holds the unformatted content. so when you convert the XML to text it all just runs together:
196 CHAPTER XXIX COCKNEYS T HEN there is the steam-engine style of driving; these drivers were mostly peo- ple from towns, who never had a horse of their own, and generally traveled by rail. They always seemed to think that a horse was something like a steam engine, only smaller. At any rate, they think that if only they pay for it, a horse is bound to go just as far, and just as fast, and with just as heavy a load as they please. And be the roads heavy and muddy, or dry and good ; be they stony or smooth, uphill or downhill, it is all the same — on, on, or, one must go at the same pace, with no relief, and no consideration. These people never think of getting out to walk up a steep hill. Oh, no, they have paid to ride, and ride they will! The horse? Oh, he’s used to it! What were horses made for if not to drag people 197 BLACK BEAUTY uphill? Walk! A good joke, indeed!
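to show why that happens, here's a minimal Python sketch of the flattening step. the sample XML and its element names (BOOK, PAGE, LINE, WORD) are made up for illustration and are not copied from the actual file; the point is only that once you pull the text nodes out and join them, page numbers, headings and body text end up in one undifferentiated stream.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: these element names are assumptions for illustration,
# not taken from the real XML file.
SAMPLE = """<BOOK>
  <PAGE><LINE><WORD>196</WORD></LINE><LINE><WORD>CHAPTER</WORD><WORD>XXIX</WORD></LINE><LINE><WORD>COCKNEYS</WORD></LINE></PAGE>
  <PAGE><LINE><WORD>197</WORD></LINE><LINE><WORD>BLACK</WORD><WORD>BEAUTY</WORD></LINE></PAGE>
</BOOK>"""


def flatten(xml_text):
    """Return all text in the document as one space-separated string."""
    root = ET.fromstring(xml_text)
    # itertext() walks every text node in document order, including the
    # whitespace between tags; split()/join() collapses that whitespace.
    return " ".join(" ".join(root.itertext()).split())


print(flatten(SAMPLE))
```

nothing in the resulting string says where a page or a heading started, which is the same thing you see in the excerpt above.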
what this needs is a separate file that provides formatting instructions, and that doesn't appear to be available. normally there would be an XSL file available to format the raw text, but for some reason that isn't here, so there's no easy way to "recover" that formatting information. even if you plug the XML into an ebook reader or something, it would have to make a best guess, and even Calibre, the gold standard for this, has no idea what to do
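for what a "best guess" could look like, here's a rough Python sketch. it is not what Calibre or any other reader actually does; it's a made-up heuristic (the `guess_pages` name and the regex are mine) that assumes a page starts with a number followed by one or more ALL-CAPS words, which happens to fit the excerpt above and will misfire on plenty of other text.

```python
import re

# Heuristic assumption (not from any real tool): a page begins with a page
# number followed by at least one ALL-CAPS word of two or more letters,
# i.e. a running header or chapter heading.
PAGE_START = re.compile(r"(?<!\S)(\d{1,4}) ((?:[A-Z]{2,} )+)")


def guess_pages(flat_text):
    """Split flattened text into (page_number, header, body) tuples."""
    matches = list(PAGE_START.finditer(flat_text))
    pages = []
    for i, m in enumerate(matches):
        # The body runs until the next guessed page start, or to the end.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(flat_text)
        body = flat_text[m.end():end].strip()
        pages.append((int(m.group(1)), m.group(2).strip(), body))
    return pages


text = (
    "196 CHAPTER XXIX COCKNEYS T HEN there is the steam-engine style of "
    "driving; What were horses made for if not to drag people "
    "197 BLACK BEAUTY uphill? Walk! A good joke, indeed!"
)
for number, header, body in guess_pages(text):
    print(number, "|", header, "|", body[:30])
```

even on this tiny sample it can't tell a chapter heading from a running header, and the drop cap in "T HEN" stays broken, so anything like this is guesswork rather than recovered formatting.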