[jdom-interest] Detecting if file is XML

Mattias Jiderhamn mj-lists at expertsystems.se
Wed Jul 14 06:00:02 PDT 2010


Thanks all for your suggestions. For some reason, the responses in this 
thread didn't reach me until a month or so later.

In my particular case, the file extension could not be trusted (or 
rather, would never be .xml). I wasn't interested in whether the XML was 
valid at the point of file type detection, but I did need to take 
encodings and white space into account.
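
For readers without access to the linked code, here is a minimal sketch
of one way to meet those requirements. It is not the ForkCan solution;
the class and method names (XmlSniffer, startsLikeXml) are hypothetical.
It assumes the caller has already read the first few hundred bytes of
the file, guesses the encoding from a byte order mark or from the
position of zero bytes, skips leading XML white space, and checks that
the first real character is '<'.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class XmlSniffer {

    /**
     * Heuristic only: returns true if the first non-whitespace character
     * of the given file prefix is '<'. Handles UTF-8 and UTF-16, with or
     * without a byte order mark. UTF-32 and EBCDIC are not handled.
     */
    static boolean startsLikeXml(byte[] head) {
        Charset charset = StandardCharsets.UTF_8; // also fine for ASCII-compatible encodings
        int offset = 0;
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) {          // UTF-8 BOM
            offset = 3;
        } else if (startsWith(head, 0xFE, 0xFF)) {         // UTF-16 big-endian BOM
            charset = StandardCharsets.UTF_16BE;
            offset = 2;
        } else if (startsWith(head, 0xFF, 0xFE)) {         // UTF-16 little-endian BOM
            charset = StandardCharsets.UTF_16LE;
            offset = 2;
        } else if (head.length >= 2 && head[0] == 0 && head[1] != 0) {
            charset = StandardCharsets.UTF_16BE;           // no BOM, zero byte first
        } else if (head.length >= 2 && head[0] != 0 && head[1] == 0) {
            charset = StandardCharsets.UTF_16LE;           // no BOM, zero byte second
        }
        String text = new String(head, offset, head.length - offset, charset);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
                continue; // skip XML white space
            }
            return c == '<';
        }
        return false; // empty or white space only
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A check like this is cheap but says nothing about well-formedness, and
HTML or any other '<'-led format passes it too, so it only suits a first
filter.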

Anyway, I just shared the solution on ForkCan. See 
http://www.forkcan.com/viewcode/196/Detect-if-file-contains-XML

</Mattias>

----- Original Message -----
Subject: Re: [jdom-interest] Detecting if file is XML
Date: Mon, 07 Jun 2010 07:39:41 -0400
From: Rolf <jdom at tuis.net>

Mattias Jiderhamn wrote:
 > This is semi-off topic, but what is the best way - performance wise -
 > to determine if a file is an XML file or not, from Java?
 >
I have done this recently...

I can't share the code, but I created a 'simple' SAX handler that throws
a custom exception after a configured amount ('X') of well-formed XML.
The calling code catches any exceptions. If the parse completes without
an exception, the file is small and well-formed. If the calling code
gets the special custom exception, the file is larger and its first 'X'
amount is well-formed XML. Any other exception means the file is not
XML.

The SAX Handler does nothing with the content except count
'startElement()' and 'characters()' method calls.

This makes for a pretty fast, efficient well-formedness check, without
any unneeded overhead.
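
Since the original code was not shared, the following is only a sketch
of the approach as described, not Rolf's implementation. All names
(XmlPrefixCheck, LimitReachedException, CountingHandler, looksLikeXml)
are hypothetical, and the 'X' amount is assumed to be a count of
startElement() and characters() calls.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class XmlPrefixCheck {

    /** Hypothetical marker exception: enough well-formed XML has been seen. */
    static final class LimitReachedException extends SAXException {
        LimitReachedException() {
            super("event limit reached");
        }
    }

    /** Ignores all content; only counts startElement() and characters() calls. */
    static final class CountingHandler extends DefaultHandler {
        private final int limit;
        private int events;

        CountingHandler(int limit) {
            this.limit = limit;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) throws SAXException {
            count();
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            count();
        }

        private void count() throws SAXException {
            if (++events >= limit) {
                throw new LimitReachedException(); // abort the parse early
            }
        }
    }

    /** True if the whole stream, or its first 'limit' events, parsed cleanly. */
    static boolean looksLikeXml(InputStream in, int limit) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.newSAXParser().parse(in, new CountingHandler(limit));
            return true;                               // small file, fully parsed
        } catch (SAXException e) {
            return e instanceof LimitReachedException; // limit hit, or a parse error
        } catch (IOException e) {
            throw new UncheckedIOException(e);         // read failure, not a verdict
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The sketch relies on the parser rethrowing the handler's exception
unchanged, which the JDK's built-in parser does. It does no hardening
against external DTDs or entities, so an untrusted file with a DOCTYPE
may still trigger external lookups.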

Sure, there may be a better way, but this worked for me...

Rolf



