Archive for the ‘C#’ Category
Interesting code with HtmlAgilityPack
Posted by fr3dr1k | Filed under Browsers, C#
Yesterday I was busy with HTML to PDF conversion, and for this I used the HTML Agility Pack. Everything worked great, except that IE and FF/Chrome seemed to produce different HTML. So today I took some fairly straightforward HTML and pushed it through the HtmlAgilityPack:
New Website Under Construction
And if I use this code to loop through the childnodes:
HtmlDocument doc = new HtmlDocument();
string s;
StringBuilder builder = new StringBuilder();
using (StreamReader reader = new StreamReader(@"C:\Documents and Settings\user\Desktop\fremus.net\index.htm"))
{
    while ((s = reader.ReadLine()) != null)
    {
        builder.AppendLine(s);
    }
}
doc.LoadHtml(builder.ToString());
Console.WriteLine(doc.DocumentNode.ChildNodes.Count);
foreach (HtmlNode node in doc.DocumentNode.ChildNodes)
{
    Console.WriteLine(node.Name);
    foreach (HtmlNode childNode in node.ChildNodes)
    {
        Console.WriteLine("\t\t" + childNode.Name);
        foreach (HtmlNode grandChildNode in childNode.ChildNodes)
        {
            Console.WriteLine("\t\t\t" + grandChildNode.Name);
        }
    }
}
I get the following result in my command line window:
As you can see from the output, the html node has a #text node. The head node has a #text node, and it has 9 child nodes including 5 #text nodes. The body node has a #text node as well, and it has 7 child nodes, four being #text and the other three being div. So what is this #text node? If you read this article on the W3Schools site you will see that it states:
A common error in DOM processing is to expect an element node to contain text.
However, the text of an element node is stored in a text node.
On the same page it then gives an example using a title tag. If you search Google for “html #text node”, you will see that the second result points to an article, and if you read the bit on the nodes it seems that each #text node is a child node in its own right. The #text nodes that appear in the body node seem to correspond to the whitespace (line breaks and indentation) after each div or other element inside the body node. If I change my code slightly:
Console.WriteLine("\t\t" + childNode.Name);
foreach (HtmlNode grandChildNode in childNode.ChildNodes)
{
    Console.WriteLine("\t\t\t" + grandChildNode.Name);
    Console.WriteLine("\t\t\t\t" + grandChildNode.HasChildNodes);
}
It tells me that the divs have child nodes, but the #text nodes do not. Thus it seems that for each run of ‘empty space’ inside a node there exists a #text node. If I amend the HTML from earlier like this:
Then the footer div will have two text nodes, and the paragraph node will have a text node. My issues yesterday had to do with the way IE rendered the HTML: when I used the HtmlAgilityPack to parse it, the node counts weren’t the same. From the sample HTML I have given so far that difference is negligible, but I found that if I went to a site like this one, saved the HTML from IE and from Chrome into separate HTML files and ran my code against each file, I got different node counts. Here are two screenshots that illustrate this:
The first screenshot is the HTML from the page saved from Chrome and the second one is from IE. Notice the extra text nodes.
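If the extra whitespace-only #text nodes are the only difference between the two files, one way to compare them is to leave those nodes out of the count. The following is a minimal sketch of that idea, not the code I used above; NodeCounter, CountNodes and the sample markup are hypothetical names, and it assumes the HtmlAgilityPack assembly is referenced:

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack package is referenced

public static class NodeCounter
{
    // Counts every node below the document root. When skipWhitespaceText is true,
    // #text nodes that contain only whitespace are left out of the count.
    public static int CountNodes(string html, bool skipWhitespaceText)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants()
            .Count(n => !(skipWhitespaceText
                          && n.NodeType == HtmlNodeType.Text
                          && string.IsNullOrWhiteSpace(n.InnerText)));
    }

    public static void Main()
    {
        // Hypothetical sample markup, not the page from the post.
        string html = "<div>\n  <p>Hello</p>\n</div>";
        Console.WriteLine(CountNodes(html, false)); // all nodes
        Console.WriteLine(CountNodes(html, true));  // whitespace-only text ignored
    }
}
```

Running both counts against the IE and the Chrome file should show whether whitespace alone explains the difference.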
WCF – Getting the foundations right
Posted by fr3dr1k | Filed under C#, WCF
Ok, so admittedly I have been using ASMX services for too long now, and the time has come to kick them to the curb and adopt WCF. The issue I have been having of late is that I was skimming through code just to get stuff done, without spending the time to understand some of the details.
Why would I want to adopt WCF? Well, there is the list of reasons found in articles on MSDN (one whitepaper can be found here); of particular interest are the combination of technologies and the general idea that interoperability is the main goal. But these things are just a way of promoting the technology, and it’s not until you understand what it can do that you realise what it is you are dealing with. To get to that point you need to work through an example, and I found that after I worked through the “Getting Started Tutorial” example a light went on and I was like, “ok, I get it”. In terms of C# code, a WCF service is made up of two key elements (there is a third as well, but it is not code):
- An interface marked as a service contract using the ServiceContract attribute, with its methods marked as operations using the OperationContract attribute
- A class that implements the methods in the interface
The third part of a WCF service is the configuration, which can be found in the system.serviceModel section of a web.config/app.config. Within that section you define service behaviours as well as endpoints. One of the keys to understanding WCF is knowing that a service is exposed through its endpoints: an endpoint is what a consumer sees and connects to. A WCF service can be consumed by client web apps, Silverlight apps and desktop apps. The endpoints themselves have configuration settings as well, for example settings relating to message sizes.
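To make the two code elements concrete, here is a minimal sketch of a contract and its implementation. ICalculator and CalculatorService are hypothetical names (not taken from the tutorial), and the sketch assumes the project references System.ServiceModel; hosting and endpoint configuration are left out:

```csharp
using System;
using System.ServiceModel; // assumption: the project references System.ServiceModel

// Hypothetical contract: the interface is the service contract,
// and each exposed method is marked as an operation.
[ServiceContract]
public interface ICalculator
{
    [OperationContract]
    int Add(int a, int b);
}

// The class that implements the methods in the interface.
public class CalculatorService : ICalculator
{
    public int Add(int a, int b)
    {
        return a + b;
    }
}

public static class Program
{
    public static void Main()
    {
        // Calling the class directly; a real host would expose it through an endpoint.
        ICalculator calculator = new CalculatorService();
        Console.WriteLine(calculator.Add(2, 3));
    }
}
```

The attributes only describe the contract; nothing is reachable by a client until an endpoint is configured for the service.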
From the tutorial I was able to see that you can run a WCF service and view it in a browser without having IIS running. That’s something I need to think about, as it does pose a few interesting questions. After I did the tutorial I wanted to do a simple REST service; that took a few minutes, but I eventually got it sorted. StackOverflow was quite helpful and so were several articles on MSDN, with this one being the most helpful.
Using System.Uri’s Segments property and List<T>.ForEach
Posted by fr3dr1k | Filed under C#
You know the feeling when you see someone use a property and you go, nice, I never knew that? Well, last night that happened to me after reading Scott Hanselman’s article on Windows PowerShell. In the article he created a script that automatically downloaded his podcasts. To do this he used the System.Uri class, which has a property called Segments. Segments returns a string array containing the parts of the Uri’s path, split at each forward slash ‘/’ (each part keeps its trailing slash). So let’s say you have this URL:
http://developer.yahoo.com/yap/guide/caja-support.html
You could then use System.Uri to get all the bits in the Uri like this:
Uri url = new Uri("http://developer.yahoo.com/yap/guide/caja-support.html");
string[] arrUri = url.Segments;
var item = from u in arrUri
           select u;
foreach (string s in item)
{
    Console.WriteLine(s);
}
Something else I started using yesterday is the ToList().ForEach combination: ToList() turns the query result into a List<T>, and ForEach takes a delegate that is run for each item. I noticed it in the LinqToTwitter API:
var twitterTrends = from trends in tCtx.Trends
                    select trends;
twitterTrends.ToList().ForEach(t =>
    Console.WriteLine(t.Query));
ForEach works on any List<T>, so if you take the same code I wrote earlier using a foreach loop for the Uri segments, you can rewrite it as:
item.ToList().ForEach(s => Console.WriteLine(s));
The variable s is the lambda parameter, and its type is inferred from the list’s element type, which here is string. That is shorter code. It is not more efficient, because ToList() first copies the results into a new list, but it sure looks nicer.
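To see what Segments actually returns for the URL above, here is a small sketch. UriSegmentsDemo and CleanSegments are hypothetical names I made up for the example; the helper simply trims the trailing slashes and drops the root segment:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UriSegmentsDemo
{
    // Returns the path segments with the trailing '/' trimmed and the root "/" dropped.
    public static List<string> CleanSegments(string url)
    {
        return new Uri(url).Segments
            .Select(s => s.TrimEnd('/'))
            .Where(s => s.Length > 0)
            .ToList();
    }

    public static void Main()
    {
        string url = "http://developer.yahoo.com/yap/guide/caja-support.html";

        // Raw segments: "/", "yap/", "guide/", "caja-support.html"
        new Uri(url).Segments.ToList().ForEach(s => Console.WriteLine(s));

        // Cleaned segments: "yap", "guide", "caja-support.html"
        CleanSegments(url).ForEach(s => Console.WriteLine(s));
    }
}
```

The last cleaned segment is the file name, which is the bit you want when saving a download.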
Getting POST values with an ASHX file
Posted by fr3dr1k | Filed under AJAX, C#, Web Development, Web Technologies
Today I had a scenario where I wanted to post items from multiple HTML input elements to a generic handler (.ashx) file without using the action attribute of the form. Specifying the action means that you are navigated away from the page where the action is happening, which means re-creating UI logic. How did I achieve this? By using an XMLHttpRequest with the POST method. GET places everything inside the querystring, which is ok, but I wondered what would happen if the content was too long for the querystring. Servers can limit the size of a POST as well, but POST transfers the data in the body of the request rather than in the URL. So let’s say you had this JavaScript:
getXMLHTTPPostObject: function(url, elementName, parameters) {
    var xmlhttp; // keep the request object local instead of leaking a global
    if (window.XMLHttpRequest) {
        // code for IE7+, Firefox, Chrome, Opera, Safari
        xmlhttp = new XMLHttpRequest();
    }
    else if (window.ActiveXObject) {
        // code for IE6, IE5
        xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
    }
    else {
        alert("Your browser does not support XMLHTTP!");
        return; // nothing to send the request with
    }
    // write the response into the target element once the request has completed
    xmlhttp.onreadystatechange = function() {
        if (xmlhttp.readyState == 4) {
            document.getElementById(elementName).innerHTML = xmlhttp.responseText;
        }
    };
    xmlhttp.open("POST", url, true);
    xmlhttp.send(parameters);
}
I use this function to create an XMLHttpRequest object by passing it:
- the URL for the AJAX call
- the name of the element to put the result of the request in
- a parameter list
I then have a function like this:
addPost: function() {
    objXMLHTTP.getXMLHTTPPostObject("url to handler", "categoryTemp", "postTitle=" + document.getElementById("txtPostTitle").value + "&blogpost=" + document.getElementById("txtBlogPost").value);
}
The function gets the values of two HTML input elements and passes them as parameters. The parameters are then sent as the body of the POST request. My next challenge was to get the data in the generic handler (.ashx) so that I could process it. I also wanted to return the data from the ASHX file to see that it’s processed successfully. So in my handler I created this code:
context.Response.ContentType = "text/plain";
System.IO.Stream body = context.Request.InputStream;
System.Text.Encoding encoding = context.Request.ContentEncoding;
// the using block closes the reader (and the underlying stream) when done
using (System.IO.StreamReader reader = new System.IO.StreamReader(body, encoding))
{
    //if (context.Request.ContentType != null)
    //{
    //    context.Response.Write("Client data content type " + context.Request.ContentType);
    //}
    string s = reader.ReadToEnd();
    // the body looks like name1=value1&name2=value2
    string[] content = s.Split('&');
    for (int i = 0; i < content.Length; i++)
    {
        // split on the first '=' only, so a value may itself contain '='
        string[] fields = content[i].Split(new[] { '=' }, 2);
        context.Response.Write(fields[0] + "\n");
        // a pair without '=' has no value part
        context.Response.Write((fields.Length > 1 ? fields[1] : "") + "\n");
    }
    //context.Response.Write(s);
}
I first get a Stream from the Request.InputStream property, after which I read the content encoding of the request. I then create a StreamReader instance and call its ReadToEnd method. After this I do some string manipulation and write the text to the response; the XMLHttpRequest callback then writes that content to an HTML element.
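The string manipulation in the handler does not decode anything, so values containing spaces or ampersands would come through mangled or split. Below is a minimal sketch of parsing such a body into decoded name/value pairs. FormBodyParser is a hypothetical helper, and it assumes the client encodes each value (for example with encodeURIComponent), which the JavaScript shown earlier does not do:

```csharp
using System;
using System.Collections.Generic;

public static class FormBodyParser
{
    // Parses a body such as "postTitle=Hello+World&blogpost=a%26b"
    // into decoded name/value pairs.
    public static Dictionary<string, string> Parse(string body)
    {
        var result = new Dictionary<string, string>();
        foreach (string pair in body.Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // split on the first '=' only, so a value may contain '='
            string[] fields = pair.Split(new[] { '=' }, 2);
            string name = Decode(fields[0]);
            string value = fields.Length > 1 ? Decode(fields[1]) : "";
            result[name] = value; // a repeated name keeps the last value
        }
        return result;
    }

    // '+' stands for a space in form encoding; %XX escapes are handled by Uri.
    private static string Decode(string text)
    {
        return Uri.UnescapeDataString(text.Replace('+', ' '));
    }

    public static void Main()
    {
        foreach (var entry in Parse("postTitle=Hello+World&blogpost=a%26b"))
        {
            Console.WriteLine(entry.Key + " = " + entry.Value);
        }
    }
}
```

Inside the handler the body read by ReadToEnd could be passed straight to Parse.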
Getting a single XAttribute or XElement
Posted by fr3dr1k | Filed under C#, LINQ to XML
I have come across two ways of querying XElements or XAttributes:
- Make the query result an IEnumerable and loop through the results
- Use the Single() method to return a single object only
Typically you may have a situation where you write some LINQ to XML like this:
IEnumerable<XAttribute> attrib = from att in elemInner.Attributes() where att.Name.ToString() == "sectionName" select att;
To loop through the results you have to use a foreach loop like this:
foreach (XAttribute innerAttrib in attrib)
{
    builder.AppendLine(innerAttrib.Value);
}
If however you just wanted a single attribute result, you could write the same code like this:
XAttribute attribDisplayStyle = (from att in elemInner.Attributes()
                                 where att.Name.ToString() == "sectionName"
                                 select att).Single();
Then you only have a single XAttribute instance and you don’t need a foreach loop (note that Single() throws an exception if the query matches no attribute, or more than one):
string value = attribDisplayStyle.Value;
You would apply the same logic to XElement as well.
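Because Single() throws when nothing matches, SingleOrDefault() is the gentler option when the attribute may be missing. Here is a minimal sketch; AttributeLookup, GetAttributeValue and the sample element are hypothetical names for illustration:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AttributeLookup
{
    // Returns the value of the named attribute, or null when it is missing.
    public static string GetAttributeValue(XElement element, string name)
    {
        XAttribute attribute = element.Attributes()
            .SingleOrDefault(att => att.Name == name);
        return attribute == null ? null : attribute.Value;
    }

    public static void Main()
    {
        // Hypothetical sample element.
        XElement elemInner = XElement.Parse("<section sectionName=\"intro\" />");

        Console.WriteLine(GetAttributeValue(elemInner, "sectionName"));
        Console.WriteLine(GetAttributeValue(elemInner, "missing") == null);

        // XElement.Attribute(name) does the same lookup in one call.
        Console.WriteLine((string)elemInner.Attribute("sectionName"));
    }
}
```

For a lookup by name, XElement.Attribute and XElement.Element are the shortest route; the query form is more useful when the condition is something other than the name.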
Mimicking AJAX behaviour with Generic Handler (.ashx) file uploader
Posted by fr3dr1k | Filed under ASP.NET, C#, Web Development
Earlier today I posted a blog about using a generic handler (.ashx) to upload a file to a web server. In the back of my head I wanted to use it somewhere neat and special, and I also wanted to find the most reliable working version, to learn what the approaches to doing so are, and why not to do it a certain way.
So back to the topic of the post. The first thing that needs to be understood is that you CANNOT upload files with AJAX/JavaScript. This is because script cannot retrieve the contents of a file off the local system; if it could, it would cause major security headaches. So what I ended up doing is following the IFRAME approach, by which you make it look as if the upload is happening all AJAX-like when in actual fact it’s not. So what I did was create an IFRAME:
Notice that I added a div called divTimerValue, which I use to display a progress indicator; in my case it is the word ‘busy’ followed by dots that grow and then reset. In the source file (where the IFRAME points to) I create a form:
Notice that it has the ReturnValue.ashx action and that I have added an onclick function to the submit button, which looks like this:
function dotsAnimate() {
    parent.document.getElementById("divFrame").style.display = 'none';
    var dotspan = parent.document.getElementById("divTimerValue");
    dotspan.style.display = '';
    setInterval(function() {
        if (dotspan.innerHTML == 'busy...') {
            dotspan.innerHTML = 'busy.';
        }
        else {
            dotspan.innerHTML += '.';
        }
    }, 1000);
};
The dotsAnimate function uses the setInterval function to create the animation. The generic handler then returns some HTML that contains a JavaScript function that stops the timer and prints a message in the divTimerValue div:
context.Response.Write(@"");
context.Response.Write(savedFileName);
And this code produces the desired result.
Uploading files with a generic handler (.ashx)
Posted by fr3dr1k | Filed under ASP.NET, C#
In recent times, the last year or so, I have moved away from web forms and towards an architecture that combines plain HTML with web services or, in some cases, generic handlers. I like this approach because it allows me to control the quality of the rendered HTML precisely, which is good for a few reasons, one being search engines and the other being true to web standards (or at least trying to be). It’s relatively easy to combine an asynchronous call with an ASHX file, and ASHX files are lighter in terms of processing than their ASPX counterparts. With that in mind I decided to write a generic handler based on Scott Hanselman’s example here. And it works! That’s the key thing.
First create an HTML file and add code for a form like this:
Then you create the code in the ASHX file like this:
string savedFileName = "";
foreach (string file in context.Request.Files)
{
    HttpPostedFile hpf = context.Request.Files[file] as HttpPostedFile;
    // skip empty file inputs
    if (hpf.ContentLength == 0)
        continue;
    savedFileName = context.Server.MapPath(Path.GetFileName(hpf.FileName));
    hpf.SaveAs(savedFileName);
}
context.Response.Write(savedFileName);
Getting search results from Bing REST service and using LINQ to process results
Posted by fr3dr1k | Filed under Bing, C#, LINQ to XML
So today I started reading about the Bing API. I got myself an API key and read through the basic instruction manual, which tells you how to get search results from the web through the Bing REST service. It’s pretty straightforward; just get your own key though! Here is some sample code that does the trick and uses LINQ to XML to process the results:
XDocument document = XDocument.Load("http://api.search.live.net/xml.aspx?Appid=&query=sushi&sources=web");
XElement root = document.Root;
XNamespace web = "http://schemas.microsoft.com/LiveSearch/2008/04/XML/web";
// SingleOrDefault returns null when the response has no Results element
var searchItems = document.Descendants(web + "Results").SingleOrDefault();
if (searchItems != null)
{
    IEnumerable<XElement> testelem = from el in searchItems.Elements()
                                     select el;
    foreach (XElement myElem in testelem)
    {
        context.Response.Write(myElem.Value + "");
    }
}
That is very easy!
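The part that tripped me up at first is the namespace: element names have to be combined with the XNamespace or the query finds nothing. Here is a minimal sketch of the same pattern against an inline XML string, so it can be tried without an API key. The namespace urn:example:web and the element names are made up for the example and are not the real Bing schema:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ResultReader
{
    // Hypothetical namespace and element names, used only for this sketch.
    private static readonly XNamespace Web = "urn:example:web";

    // Returns the Title of every WebResult under the single Results element.
    public static List<string> ReadTitles(string xml)
    {
        XDocument document = XDocument.Parse(xml);
        XElement results = document.Descendants(Web + "Results").SingleOrDefault();
        if (results == null)
        {
            return new List<string>();
        }

        return results.Elements(Web + "WebResult")
            .Select(r => (string)r.Element(Web + "Title"))
            .Where(t => t != null)
            .ToList();
    }

    public static void Main()
    {
        string xml =
            "<SearchResponse xmlns:web=\"urn:example:web\">" +
            "<web:Web><web:Results>" +
            "<web:WebResult><web:Title>Sushi bar</web:Title></web:WebResult>" +
            "<web:WebResult><web:Title>Sushi recipes</web:Title></web:WebResult>" +
            "</web:Results></web:Web></SearchResponse>";

        ReadTitles(xml).ForEach(t => Console.WriteLine(t));
    }
}
```

Casting the element to string instead of reading .Value means a result without a Title yields null rather than an exception.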
What are extension methods?
Posted by fr3dr1k | Filed under C#
Extension methods in C# enable you to extend existing types without having to create a new type. This essentially means that you can take the String class and add methods to it, and when you create a string those methods become available. You just have to make sure you import the namespace of the class that implements the methods. A very basic example would be an extension method that returns the length of a string. You start off by creating:
namespace MyExtensionMethods
{
    public static class MyExtensionMethods
    {
        public static int ReturnText(this String text)
        {
            return text.Length;
        }
    }
}
Notice the this keyword before String. Also notice that both the class and the method are static. A static class can only contain static members, so if you create an instance method in a static class you will get a compiler error. To use the extension method you include the using directive:
namespace ExtensionMethods2
{
    using System; // needed for Console
    using MyExtensionMethods;
    class Program
    {
        static void Main(string[] args)
        {
            string s = "Test";
            Console.WriteLine(s.ReturnText());
        }
    }
}
Notice that the string s now has an extra method associated to it called ReturnText.
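It helps to remember that an extension method is still just a static method, so the compiler accepts both call styles. A small sketch, where StringExtensions and WordCount are hypothetical names for illustration:

```csharp
using System;

public static class StringExtensions
{
    // Hypothetical extension method: counts whitespace-separated words.
    public static int WordCount(this string text)
    {
        return text.Split(new[] { ' ', '\t', '\n' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}

public static class Program
{
    public static void Main()
    {
        string s = "one two three";

        // Extension syntax: reads as if WordCount were a method on string.
        Console.WriteLine(s.WordCount());

        // Plain static call: the same method, called the ordinary way.
        Console.WriteLine(StringExtensions.WordCount(s));
    }
}
```

Both lines print the same number, because the first call is compiled into the second.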
C# and streams
Posted by fr3dr1k | Filed under Application Development, C#
C# uses a lot of streams, or at least it seems so in some of the code examples I have looked at.
Having said that, I felt the need to read up on the Stream class. The Stream class is an abstract class, and an abstract class acts as a base class from which other classes can inherit. Just to recap, abstract methods do not have a method body and must be implemented (overridden) by any non-abstract class inheriting from the base class. A method marked as virtual has a body; it can be overridden but it does not have to be.
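A small sketch of the difference between abstract and virtual, using hypothetical Shape, Square and Rectangle classes that have nothing to do with Stream itself:

```csharp
using System;

public abstract class Shape
{
    // Abstract: no body here, every concrete subclass must override it.
    public abstract int Area();

    // Virtual: has a default body, subclasses may override it.
    public virtual string Describe()
    {
        return "shape with area " + Area();
    }
}

public class Square : Shape
{
    private readonly int side;
    public Square(int side) { this.side = side; }

    public override int Area() { return side * side; }
    // Describe is not overridden, so the base version is used.
}

public class Rectangle : Shape
{
    private readonly int width;
    private readonly int height;
    public Rectangle(int width, int height) { this.width = width; this.height = height; }

    public override int Area() { return width * height; }

    public override string Describe()
    {
        return "rectangle with area " + Area();
    }
}

public static class Program
{
    public static void Main()
    {
        Shape[] shapes = { new Square(3), new Rectangle(2, 4) };
        foreach (Shape shape in shapes)
        {
            Console.WriteLine(shape.Describe());
        }
    }
}
```

Leaving out the Area override in Square would be a compiler error, while leaving out Describe is fine.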
So the Stream class is an abstract base class for a set of other classes that read and write different kinds of data, such as files (FileStream), memory buffers (MemoryStream) and network connections (NetworkStream). Each of the classes that inherit from the base Stream class can be seen as a type of stream. Depending on the type, a stream can be read from, written to and can support seeking.
The System.IO classes that work with streams can be divided into three broad categories:
- Classes that are used for File I/O
- Classes that are used for reading and writing to streams
- Common IO stream classes
The reason for taking a look at these classes is that I was looking at a PowerPoint presentation recently. More specifically, I was looking at a PowerPoint 2007 file (pptx) and its underlying XML. If you save a PowerPoint 2007 presentation as XML and look at the source, you will note that there are sections called pkg:binaryData. It got me wondering whether that same binary data can be processed in C# code. Take a look at the following sample C# code:
Image img = Image.FromFile(@"C:\Users\fredrike\Documents\LEGAL SUITE.bmp");
byte[] myByte = File.ReadAllBytes(@"C:\Users\fredrike\Documents\LEGAL SUITE.bmp");
Console.WriteLine(myByte.Length);
MemoryStream stream = new MemoryStream(myByte);
int lengthOfByte = Convert.ToInt32(stream.Length);
byte[] mySecondByte = ReadFully(stream, lengthOfByte);
Console.WriteLine(mySecondByte.Length);
// ReadFully leaves the stream at its end, so rewind it before reading the image
stream.Position = 0;
Image img2 = Image.FromStream(stream);
img2.Save(@"C:\Users\fredrike\Documents\LEGAL SUITE3.bmp");
Notice a couple of things I have done in the code. I created an Image instance from an existing bitmap file. I then created a byte array using the static ReadAllBytes method supplied by the File class, and simply wrote the length of the array to the screen. In the next couple of lines I create a MemoryStream instance from the byte array and read its Length property, because I pass that length to a function called ReadFully:
public static byte[] ReadFully(Stream stream, int initialLength)
{
    // If we've been passed an unhelpful initial length, just
    // use 32K.
    if (initialLength < 1)
    {
        initialLength = 32768;
    }
    byte[] buffer = new byte[initialLength];
    int read = 0;
    int chunk;
    while ((chunk = stream.Read(buffer, read, buffer.Length - read)) > 0)
    {
        read += chunk;
        // If we've reached the end of our buffer, check to see if there's
        // any more information
        if (read == buffer.Length)
        {
            int nextByte = stream.ReadByte();
            // End of stream? If so, we're done
            if (nextByte == -1)
            {
                return buffer;
            }
            // Nope. Resize the buffer, put in the byte we've just
            // read, and continue
            byte[] newBuffer = new byte[buffer.Length * 2];
            Array.Copy(buffer, newBuffer, buffer.Length);
            newBuffer[read] = (byte)nextByte;
            buffer = newBuffer;
            read++;
        }
    }
    // Buffer is now too big. Shrink it.
    byte[] ret = new byte[read];
    Array.Copy(buffer, ret, read);
    return ret;
}
I got this from Jon Skeet’s website. Be sure to check out his other C# articles. In my code I then created a second byte array, which is filled by the ReadFully method. Once the bytes had been read I created a second Image instance, img2, from the MemoryStream instance, and then called img2’s Save method to write a second image file based on the first one.
So my question is, can the binary data from PowerPoint presentations be used in the same way?
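One way to start exploring that question: if the pkg:binaryData sections hold Base64 text (that is my assumption here, I have not verified it against the PowerPoint output), the text can be decoded into bytes and wrapped in a MemoryStream, just like the bitmap above. BinaryDataDemo and ToStream are hypothetical names, and the sample bytes are made up:

```csharp
using System;
using System.IO;

public static class BinaryDataDemo
{
    // Turns Base64 text into a stream positioned at the start of the decoded bytes.
    public static MemoryStream ToStream(string base64Text)
    {
        byte[] bytes = Convert.FromBase64String(base64Text);
        return new MemoryStream(bytes);
    }

    public static void Main()
    {
        // Stand-in for the text inside a pkg:binaryData section.
        byte[] original = { 1, 2, 3, 4 };
        string base64Text = Convert.ToBase64String(original);

        using (MemoryStream stream = ToStream(base64Text))
        {
            Console.WriteLine(stream.Length);
        }
    }
}
```

If the decoded bytes are an image, the resulting stream could then be handed to Image.FromStream in the same way as before.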
