Reading Files from a Directory using C#

April 24th, 2009 | Comments Off | Posted in C#, General, HTML, IIS, Programming, VB.NET
So I had a project where I needed to find out which files were held in a given directory and write them back to the browser. This is very helpful for letting your users post files to a website without you getting involved in tedious static HTML updates. The key here is the System.IO namespace in .NET. We are going to use the DirectoryInfo class so we can iterate through and pull back all the file objects contained within. In the code below, the DirectoryInfo looks in the current directory, as we’ve defined through Server.MapPath("."). We could just as easily have pointed it at a different directory (e.g. c:\inetpub\wwwroot\directoryread\filestoread\). Next, we tell it to grab all the files that end with an .aspx extension, which screens out any text or PDF files we may not be interested in. With the FileInfo class, we extract the name of each file to wire up the hyperlink and print this back to the user. I’ve taken it one step further to strip out the hyphens and the .aspx file extension from the name we print back to the user, so the file name “Reading-files-from-a-directory-using-csharp.aspx” becomes “Reading files from a directory using cSharp”. Much cleaner and more user friendly. The FileInfo class has lots of cool properties, including LastWriteTime, CreationTime and Length (for a full rundown of the FileInfo properties, check out Microsoft’s .NET Framework Developer Center).
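The steps described above can be sketched roughly as follows. This is a minimal sketch rather than the original listing: the class and method names (DirectoryLister, ToDisplayName, BuildLinks) are made up for illustration, and the link markup is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DirectoryLister
{
    // Drops the extension and swaps hyphens for spaces,
    // e.g. "My-first-post.aspx" becomes "My first post".
    public static string ToDisplayName(string fileName)
    {
        return Path.GetFileNameWithoutExtension(fileName).Replace("-", " ");
    }

    // Returns one HTML link per .aspx file found directly inside the folder.
    public static List<string> BuildLinks(string folderPath)
    {
        var links = new List<string>();
        var directory = new DirectoryInfo(folderPath);

        foreach (FileInfo file in directory.GetFiles("*.aspx"))
        {
            links.Add("<a href=\"" + file.Name + "\">" + ToDisplayName(file.Name) + "</a><br />");
        }
        return links;
    }
}
```

In an ASP.NET page you would pass Server.MapPath(".") as the folder path and write each returned link to the response. File names containing characters that are special in HTML would need encoding first, which this sketch skips.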

Inserting XML into ASP.NET DataList

March 3rd, 2009 | Comments Off | Posted in ASP.NET, C#, XML
No heavy lifting today, just back to basics working with ASP.NET’s DataList control. So I’ve worked to convert the album listings of a popular music download site into an XML format so I can display them on my sites. Going the route of a DataGrid or GridView would be too constraining for my needs since I want to display pictures and have two columns per row. In other words, I want to customize the layout, and that isn’t DataGrid’s strong suit. Enter the DataList. With the DataList, I can customize the data flow in the ItemTemplate as well as set how many columns to repeat and in which direction. All this in very little code. First we need to import the System.Data namespace so we can load our XML document into a DataSet. Once there, we’ll bind it to the DataList, which is defined here as dlAlbums. Now looking at our DataList, we set the RepeatDirection and RepeatColumns properties to define how our cells are going to be written to the table. From there, we pull in our values from the XML document, filling in the appropriate placeholders. It doesn’t get much easier than that.
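As a rough illustration of the loading step, here is a minimal sketch that reads an XML document into a DataSet. The class name AlbumData is hypothetical, and so is any element layout you feed it; the original album file is not shown in this post.

```csharp
using System.Data;
using System.IO;

public static class AlbumData
{
    // Reads an XML document into a DataSet. With a layout such as
    // <albums><album><title>..</title></album>...</albums>, each repeating
    // <album> element becomes a row in a table named "album".
    public static DataSet Load(TextReader xml)
    {
        var albums = new DataSet();
        albums.ReadXml(xml);
        return albums;
    }
}
```

In the page's code-behind, the result would be assigned to dlAlbums.DataSource and followed by a call to dlAlbums.DataBind(). Setting RepeatColumns to 2 on the DataList gives the two-columns-per-row layout described above.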

Create Excerpt from Description Blurb

February 25th, 2009 | Comments Off | Posted in C#, SQL

So the task at hand is taking a verbose piece of description text and breaking it down to the first 300 characters. Not only are we going to trim this to 300 characters, but we are going to drop off all the text that follows the last period. This will serve as teaser copy on our store page to get our fair reader to click over to read the full description if interested. So our first step will be taking that large block and cutting it down to 300 characters:

This SQL query basically says we want the left 300 characters of the description field, and we are going to store it in a variable we call excerpt. Now comes the tricky part of dealing with the text after the period. I’ve created a console application in Visual Studio to help whittle down the results. The following sets up our initial query:
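A minimal sketch of that query setup, assuming a hypothetical Products table (the real table name is not given in this post) and any ADO.NET connection:

```csharp
using System.Data;

public static class ExcerptQuery
{
    // "Products" is a placeholder table name. LEFT(description, 300) keeps
    // the first 300 characters and the alias exposes them as "excerpt".
    public const string Sql =
        "SELECT LEFT(description, 300) AS excerpt FROM Products";

    // Works with any ADO.NET provider connection, for example a SqlConnection.
    public static IDbCommand Create(IDbConnection connection)
    {
        IDbCommand command = connection.CreateCommand();
        command.CommandText = Sql;
        return command;
    }
}
```

Calling ExecuteReader() on the returned command, once the connection is open, yields the rows to work through.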

Now we need to loop through that result set so we can pare down this data.
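One way to write that loop is sketched here, assuming the result set exposes a string column named excerpt. The ExcerptReader class is a hypothetical helper, not the original code.

```csharp
using System.Collections.Generic;
using System.Data;

public static class ExcerptReader
{
    // Walks the result set and collects the "excerpt" column of every row,
    // skipping rows where the column is NULL.
    public static List<string> ReadAll(IDataReader reader)
    {
        var excerpts = new List<string>();
        int column = reader.GetOrdinal("excerpt");

        while (reader.Read())
        {
            if (!reader.IsDBNull(column))
            {
                excerpts.Add(reader.GetString(column));
            }
        }
        return excerpts;
    }
}
```

Taking an IDataReader rather than a provider-specific reader keeps the helper usable with whichever database the descriptions live in.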

The program doesn’t need to worry about any descriptions that already end in a period or those ending in a URL, so let’s skip them.
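A sketch of such a skip check follows. How a URL is recognised here (the last word starting with http://, https:// or www.) is my assumption; the original test is not shown.

```csharp
using System;

public static class ExcerptFilter
{
    // True when the excerpt already ends in a period, or when its last word
    // looks like a URL (assumed prefixes: http://, https://, www.).
    public static bool CanSkip(string excerpt)
    {
        string text = excerpt.TrimEnd();
        if (text.EndsWith(".", StringComparison.Ordinal))
        {
            return true;
        }

        // LastIndexOf returns -1 when there is no space, so Substring(0)
        // then yields the whole text as the "last word".
        string lastWord = text.Substring(text.LastIndexOf(' ') + 1);
        return lastWord.StartsWith("http://", StringComparison.OrdinalIgnoreCase)
            || lastWord.StartsWith("https://", StringComparison.OrdinalIgnoreCase)
            || lastWord.StartsWith("www.", StringComparison.OrdinalIgnoreCase);
    }
}
```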

Here’s where we identify where the last period in our content block resides.
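Locating the last period and cutting there can be sketched with LastIndexOf and Substring. The ExcerptTrimmer name is hypothetical.

```csharp
public static class ExcerptTrimmer
{
    // Cuts the excerpt just after its last period.
    // Text that contains no period at all is returned unchanged.
    public static string TrimToLastPeriod(string excerpt)
    {
        int lastPeriod = excerpt.LastIndexOf('.');
        return lastPeriod < 0 ? excerpt : excerpt.Substring(0, lastPeriod + 1);
    }
}
```

Note that this treats every period alike, so a decimal number or an abbreviation near the end of the 300 characters would also count as the cut point.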

Check Web Page Updates Through Screen Scrape

February 24th, 2009 | Comments Off | Posted in ASP.NET, C#

I’ve got a fair share of websites that draw data from an event promoter’s website. I’d originally done research on utilizing their web services model, but was severely disappointed with their implementation. That left me with two essential problems I needed to solve:

1) I needed to know when certain events were updating so I could keep my websites up to date.

2) I needed to be able to go through the raw HTML and parse out the relevant data to generate an XML file that my GridView can pull from to display that data to my visitors.

The first task was to build a console app that I could set up as a Scheduled Task to run nightly and tell me which pages had updated. So up Visual Studio goes. I’ve chosen not to hard-code the page values so I can easily add additional pages to check in the future. I’ve stored these values in a text file called parsePages.txt as follows:

yahoo,http://www.yahoo.com
google,http://www.google.com

These values could just as easily be stored in a database or an XML document. Next, I go about pulling this data into the program so I know which pages I need to parse through. I set up a regular expression to split the values from parsePages.txt into an ArrayList so I can start making my checks.
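Here is a minimal sketch of that parsing step. It swaps the ArrayList for a generic List of name/URL pairs, and it assumes the entries are separated by whitespace or line breaks; PageList is a hypothetical name.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PageList
{
    // Parses "name,url" entries separated by whitespace or line breaks.
    public static List<KeyValuePair<string, string>> Parse(string contents)
    {
        var pages = new List<KeyValuePair<string, string>>();

        foreach (string entry in Regex.Split(contents.Trim(), @"\s+"))
        {
            // Split on the first comma only, so commas inside a URL survive.
            string[] parts = entry.Split(new[] { ',' }, 2);
            if (parts.Length == 2)
            {
                pages.Add(new KeyValuePair<string, string>(parts[0], parts[1]));
            }
        }
        return pages;
    }
}
```

A typical call would be PageList.Parse(File.ReadAllText("parsePages.txt")). Entries without a comma are silently ignored.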

Now that I have my list of files to parse through, I’m going to loop through each — checking to see if the file exists then executing the parse routine.

Here is where we do our file verification. We are using two text files for each value: one to house our current content and one for the previous day’s content. This way we can see what changes, if any, have been made since the previous day. If for some reason either the main file or the alt file doesn’t exist, we want to create it. This is handy for new events we’ve just added to the list.
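The file check could look something like the sketch below. The naming scheme ("reference.txt" for the main file, "reference_alt.txt" for the alt file) is purely an assumption for illustration, as is the SnapshotFiles class.

```csharp
using System.IO;

public static class SnapshotFiles
{
    // Assumed naming scheme: "<reference>.txt" is the main file.
    public static string MainPath(string folder, string reference)
    {
        return Path.Combine(folder, reference + ".txt");
    }

    // Assumed naming scheme: "<reference>_alt.txt" is the alt file.
    public static string AltPath(string folder, string reference)
    {
        return Path.Combine(folder, reference + "_alt.txt");
    }

    // Creates whichever of the two files is missing as an empty file
    // and leaves existing files untouched.
    public static void EnsureExist(string folder, string reference)
    {
        foreach (string path in new[] { MainPath(folder, reference), AltPath(folder, reference) })
        {
            if (!File.Exists(path))
            {
                File.WriteAllText(path, string.Empty);
            }
        }
    }
}
```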

So far we’ve done a lot of work, but we still haven’t gotten to the meat of what we’re trying to do here. Well, never fear, because here comes the parse routine. We pass the reference and the URL into the ScreenScrape method. We use the System.Net.WebRequest object to create a request for our URL and the WebResponse object to receive the server’s reply.

We read the HTML contents of the page through a StreamReader, then do a little cleanup to strip off the header and footer HTML that we aren’t concerned with.
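The download and cleanup can be sketched as below, under a few assumptions. The PageFetcher class is hypothetical, and the header and footer are located with marker strings that you would pick per site (the original markers are not shown). Newer .NET versions mark WebRequest as obsolete in favour of HttpClient, so the sketch suppresses that warning to stay with the classes named in this post.

```csharp
using System;
using System.IO;
using System.Net;

public static class PageFetcher
{
#pragma warning disable SYSLIB0014 // WebRequest is flagged obsolete on newer .NET
    // Downloads the page and returns its body as a string.
    public static string Download(string url)
    {
        WebRequest request = WebRequest.Create(url);
        using (WebResponse response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
#pragma warning restore SYSLIB0014

    // Keeps the text between the two markers. If the start marker is missing
    // the input is returned unchanged; if only the end marker is missing,
    // everything after the start marker is kept.
    public static string StripHeaderAndFooter(string html, string startMarker, string endMarker)
    {
        int start = html.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0)
        {
            return html;
        }
        start += startMarker.Length;

        int end = html.IndexOf(endMarker, start, StringComparison.Ordinal);
        return end < 0 ? html.Substring(start) : html.Substring(start, end - start);
    }
}
```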

Next, we need to compare the main text file with the alt text file to identify which one was written to most recently so we can overwrite the alternate file with the html we were chewing on in the StreamReader.
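Read one way, this step means the file that was written longest ago gets overwritten, so the more recent one survives as the previous day's copy. Here is a minimal sketch under that reading, with a hypothetical SnapshotWriter class:

```csharp
using System.IO;

public static class SnapshotWriter
{
    // Returns whichever file was written to longer ago
    // (mainPath when both timestamps are equal).
    public static string OlderOf(string mainPath, string altPath)
    {
        return File.GetLastWriteTimeUtc(mainPath) <= File.GetLastWriteTimeUtc(altPath)
            ? mainPath
            : altPath;
    }

    // Overwrites the older file with the fresh HTML and returns its path,
    // leaving the newer file in place as the previous snapshot.
    public static string SaveLatest(string mainPath, string altPath, string html)
    {
        string target = OlderOf(mainPath, altPath);
        File.WriteAllText(target, html);
        return target;
    }
}
```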

Now if we found any change between the old and the new file, we need to know about it. If the file lengths don't match, we write a line out to the console saying an update was found.
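The length comparison can be sketched like this. The ChangeDetector class and the wording of the console message are assumptions.

```csharp
using System;
using System.IO;

public static class ChangeDetector
{
    // Treats a difference in file length as a sign that the page changed.
    public static bool LengthsDiffer(string mainPath, string altPath)
    {
        return new FileInfo(mainPath).Length != new FileInfo(altPath).Length;
    }

    // Writes a line to the console when the two snapshots differ in length.
    public static void Report(string reference, string mainPath, string altPath)
    {
        if (LengthsDiffer(mainPath, altPath))
        {
            Console.WriteLine("Update found: " + reference);
        }
    }
}
```

A length check is cheap, but an edit that leaves the byte count unchanged will slip past it. Comparing the file contents directly would catch those cases too.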