HTTP Protocol violation when downloading webpage using HtmlAgilityPack
See the question and my original answer on StackOverflow.
This is not related to the Html Agility Pack directly, but rather to the underlying HTTP/socket layer. This error means the server is not sending back a correct HTTP status line.
The status line is defined in the HTTP RFC (RFC 2616, section 6.1), available here: http://www.w3.org/Protocols/rfc2616/rfc2616-sec6.html
I quote:
The first line of a Response message is the Status-Line, consisting of the protocol version followed by a numeric status code and its associated textual phrase, with each element separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
To check this, you can enable socket tracing with a full hex dump by adding the following to your application's configuration file (app.config or web.config):
<configuration>
  <system.diagnostics>
    <sources>
      <source name="System.Net.Sockets" tracemode="includehex">
        <listeners>
          <add name="System.Net.Sockets" type="System.Diagnostics.TextWriterTraceListener" initializeData="SocketTrace.log" />
        </listeners>
      </source>
    </sources>
    <switches>
      <add name="System.Net.Sockets" value="Verbose"/>
    </switches>
    <trace autoflush="true" />
  </system.diagnostics>
</configuration>
This will create a SocketTrace.log file in the current executing directory. Have a look in there; the protocol violation should be visible in the raw bytes of the response. You can post it here if it's not too big :-)
Unfortunately, if you don't own the server, there is not much you can do beyond failing gracefully in these cases (you have already added the useUnsafeHeaderParsing setting, which is good).
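For readers who have not enabled it yet, the useUnsafeHeaderParsing setting mentioned above goes in the same configuration file, under system.net/settings. This is a minimal sketch, assuming a .NET Framework app.config or web.config; if your file already has a configuration element, merge the system.net section into it rather than adding a second root:

```xml
<configuration>
  <system.net>
    <settings>
      <!-- Relaxes the strict validation of HTTP response headers.
           Enable only when you must talk to a non-compliant server. -->
      <httpWebRequest useUnsafeHeaderParsing="true" />
    </settings>
  </system.net>
</configuration>
```

The setting only loosens header parsing. It may not help when the status line itself is malformed, which is why the socket trace is still worth checking.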