What have you tried?

A simple approach would be to (a) use a DocumentBuilder to build an org.w3c.dom.Document instance from your file. Then you can (b) go through the document to find all the...
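
As a rough illustration of steps (a) and (b), here is a minimal sketch using the standard JAXP API. The class name DomWalkSketch and the helpers load and elementNames are made up for this example. Since the answer above does not say which nodes you need, the walk simply collects the name of every element; swap in your own test where the comment indicates.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class name, used only for this sketch.
class DomWalkSketch {

    // Step (a): build an org.w3c.dom.Document from a file.
    static Document load(File file) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(file);
    }

    // Step (b): walk the tree depth-first and collect element names
    // in document order.
    static List<String> elementNames(Document doc) {
        List<String> names = new ArrayList<>();
        collect(doc.getDocumentElement(), names);
        return names;
    }

    private static void collect(Node node, List<String> out) {
        // Assumption: we keep every element. Replace this condition
        // with whatever identifies the nodes you are looking for.
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.add(node.getNodeName());
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), out);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java DomWalkSketch <file.xml>");
            return;
        }
        Document doc = load(new File(args[0]));
        System.out.println(elementNames(doc));
    }
}
```

If you only need elements with a known tag name, Document.getElementsByTagName may be enough and saves you the manual recursion.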