See the question and my original answer on StackOverflow

It depends on what you want to do programmatically after the text has been parsed. If you don't want to do anything special with it, the following code:

    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml("<div><form>form and div</div>form</form>");

    doc.Save(Console.Out);

will display exactly the same string, that is:

    <div><form>form and div</div>form</form>

This is because the library was designed from the ground up to preserve the original HTML as much as possible.

But in terms of how this is represented in the DOM, and in terms of errors, it is another story. You can't have all three of the following at the same time: 1) overlapping elements, 2) an XML-like DOM (which does not support overlaps), and 3) no errors.
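To see what that trade-off looks like in practice, here is a minimal sketch that loads the same string, lists whatever the parser recorded in `doc.ParseErrors`, and dumps the resulting node tree. The `Dump` helper and the `Program` class are illustrative names of mine, not part of the library. Which errors are reported, and where the `form` element ends up in the tree, may vary with the library version and its element flags, so no expected output is shown.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<div><form>form and div</div>form</form>");

        // Errors recorded while parsing. This may be empty, depending on
        // the library version and how it treats the elements involved.
        foreach (HtmlParseError error in doc.ParseErrors)
        {
            Console.WriteLine(
                $"{error.Code} at line {error.Line}, column {error.LinePosition}: {error.Reason}");
        }

        // Show how the overlapping markup was mapped onto a tree.
        Dump(doc.DocumentNode, 0);
    }

    // Hypothetical helper: prints one line per node, indented by depth.
    static void Dump(HtmlNode node, int depth)
    {
        Console.WriteLine(new string(' ', depth * 2) + node.NodeType + " " + node.Name);
        foreach (HtmlNode child in node.ChildNodes)
        {
            Dump(child, depth + 1);
        }
    }
}
```

Comparing the dumped tree with the string written by `doc.Save` shows the difference between the preserved text and the DOM built from it.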

So it depends on what you want to do after parsing.