<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>nathan buggia</title>
	<atom:link href="http://nathanbuggia.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://nathanbuggia.com</link>
	<description>Bing UX / Bing Webmaster / Microsoft Enterprise Architecture / PalmOS</description>
	<lastBuildDate>Fri, 28 Oct 2011 04:41:41 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Browser View Controller (iPhone)</title>
		<link>http://nathanbuggia.com/posts/browser-view-controller/</link>
		<comments>http://nathanbuggia.com/posts/browser-view-controller/#comments</comments>
		<pubDate>Mon, 24 Oct 2011 06:18:33 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[Code]]></category>

		<guid isPermaLink="false">http://nathanbuggia.com/?p=356</guid>
		<description><![CDATA[<p>Github source: <a href="https://github.com/nbuggia/Browser-View-Controller--iPhone-" target="_blank">BrowserViewController</a></p> <p>iPhone apps often have the need to show a web page, and the easiest way to implement this is to have the page opened in Safari. The problem with this, is that now your customer is stuck in Safari, and they might not know how to get back into your [...]]]></description>
			<content:encoded><![CDATA[<p>Github source: <a href="https://github.com/nbuggia/Browser-View-Controller--iPhone-" target="_blank">BrowserViewController</a></p>
<p>iPhone apps often need to show a web page, and the easiest way to do that is to open the page in Safari. The problem is that your customer is now stuck in Safari and might not know how to get back to your app. This project gives you all the boilerplate code you need to open web pages smoothly within your app and return with one tap.</p>
<p><a href="http://nathanbuggia.com/wp-content/uploads/2011/10/Browser-View-Controller-1.png"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border-width: 0px;" title="Browser-View-Controller-1" src="http://nathanbuggia.com/wp-content/uploads/2011/10/Browser-View-Controller-1_thumb.png" alt="Browser-View-Controller-1" width="165" height="244" border="0" /></a>           <a href="http://nathanbuggia.com/wp-content/uploads/2011/10/Browser-View-Controller1.png"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border-width: 0px;" title="Browser-View-Controller" src="http://nathanbuggia.com/wp-content/uploads/2011/10/Browser-View-Controller_thumb1.png" alt="Browser-View-Controller" width="164" height="244" border="0" /></a></p>
<p>Here are the scenarios implemented:</p>
<ul>
<li><strong>Opening a URL from within a method </strong>– useful for opening links triggered by a UIButton or UITableView.</li>
<li><strong>Opening a URL from within a UITextView </strong>– useful for links embedded within text strings that UITextView can automatically identify and turn into clickable hyperlinks.</li>
<li><strong>Opening a URL from within a UIWebView</strong> – useful for when you are using a UIWebView to render formatted text in your application with hyperlinks.</li>
</ul>
<h3>Getting Started</h3>
<p>Please see GitHub for instructions on using the library: <a href="http://github.com/nbuggia/Browser-View-Controller--iPhone-/blob/master/README.md">http://github.com/nbuggia/Browser-View-Controller--iPhone-/blob/master/README.md</a></p>
<p>Please let me know if there are any additional features you would like to see in the comment section below!</p>
<h3>Thanks</h3>
<p>I’d like to thank <a href="http://penandthink.com/" target="_blank">Joseph Wain</a> of <a href="http://glyphish.com/" target="_blank">Glyphish</a> fame for providing the arrow icons and making them freely available to everyone. Go buy the <a href="http://glyphish.com/" target="_blank">best iPhone icons</a> from Glyphish!</p>
<p>I’d also like to thank <a href="http://www.qrayon.com/home/" target="_blank">Chen-I Lim</a> for the fix for getting this to work with the Facebook auth system. Go check out his many wonderful apps from <a href="http://www.qrayon.com/home/" target="_blank">Qrayon</a>.</p>
<h3>Other options</h3>
<p>There is one well-known library that does this today: <a href="https://github.com/facebook/three20/blob/master/src/Three20UI/Sources/TTWebController.m" target="_blank">TTWebController</a>, which is part of <a href="https://github.com/facebook/three20" target="_blank">Three20</a>, the iOS library published by Facebook as open source. The only problem is that it requires you to incorporate the whole of Three20 in your app, and it doesn’t help with opening links in UIWebViews or UITextViews. However, it is still a solid, well-written library that you should consider.</p>
<p><strong>References</strong></p>
<ul>
<li><a href="http://stackoverflow.com/questions/2543967/how-to-intercept-click-on-link-in-uitextview">How to Intercept Clicks on Links in UITextView</a> – stackoverflow.com</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://nathanbuggia.com/posts/browser-view-controller/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Kindle (on) Fire</title>
		<link>http://nathanbuggia.com/posts/the-kindle-on-fire/</link>
		<comments>http://nathanbuggia.com/posts/the-kindle-on-fire/#comments</comments>
		<pubDate>Thu, 29 Sep 2011 05:24:18 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[Finance]]></category>

		<guid isPermaLink="false">http://nathanbuggia.com/?p=290</guid>
		<description><![CDATA[<p>I watched Amazon&#8217;s announcement today, and I have to say I&#8217;m very, very impressed.</p> <p>The Kindle Fire</p> <p>The big news is that the device looks great, only costs $200, and unlimited streaming is free with the $80/year Amazon Prime. It was so cheap that I looked at the picture, looked at the price and just &#8220;1-click&#8221; [...]]]></description>
			<content:encoded><![CDATA[<p>I watched Amazon&#8217;s announcement today, and I have to say I&#8217;m very, very impressed.</p>
<p><strong>The Kindle Fire</strong></p>
<p>The big news is that the device looks great and costs only $200, and unlimited streaming is free with the $80/year Amazon Prime. It was so cheap that I looked at the picture, looked at the price and just &#8220;1-click&#8221; bought it before my brain really even had a chance to process it. Unless the media selection is terrible, I&#8217;m going to pay for Amazon Prime and cancel Netflix. I think a lot of people are going to be doing that between now and Christmas, and even more with their Christmas Amazon gift certificates.</p>
<p>The big question that everyone in the media is asking is &#8220;Will this be an iPad killer?&#8221;, and the answer is: not entirely. I think we&#8217;ve learned a lot about how people use tablets in the two years since Apple invented the space, and I think people generally fall into two camps: (1) aspiring laptop replacement, and (2) anywhere media consumption device. Most of the negative press about the iPad has come from customers in camp 2, who like the device but can&#8217;t understand why it costs so much. This is the group the Kindle Fire is made for, and I think Amazon is going to win over tons of them because it has a great product at a truly phenomenal price.</p>
<p>But can Amazon make any money at this? IDC had estimated the whole tablet market to be 46 million devices by 2014, but Apple is already selling more than 9 million per quarter, so I expect the market to be significantly larger than that. Assuming that Amazon gets 25% of the current market and sells 9 million Kindle Fires next year, that&#8217;s a whopping $1.8 billion in revenue at $200 each (Amazon&#8217;s total annual revenue is forecast to be ~$50 billion in 2011). That&#8217;s reasonable, given that Samsung sold about 10 million Galaxy tablets worldwide last year. Guess how many units Samsung could have sold if they were Amazon?</p>
<ul>
<li>Half the price of the Galaxy tablet</li>
<li>High promotion in the largest online store in the world (80 million people per month!)</li>
<li>Killer story for great Movie &amp; TV content</li>
<li>The best Android app store, a free app every day, and no malware</li>
<li>No tax and free shipping (unless you live in Seattle)</li>
</ul>
<p>But this is just revenue; what about profit? I&#8217;m guessing that they are breaking even or losing money on each of these tablets &#8211; so any profit they make will come from media purchased through the device, which Amazon already sells at a super slim margin. This is the part of the business model I don&#8217;t quite understand &#8211; Apple charges more than Amazon for all content, and that still isn&#8217;t significant enough to count for much on their balance sheet. How&#8217;s Amazon going to make a business out of this when they&#8217;ve cut margins on every side of the equation? Plus, they incur the added cost of the cloud services that come free with the device. This is all before the patent trolls wind up their lawyers to take their cut.</p>
<p>The bottom line is, while I love the product and the effect it will have on the industry, I&#8217;m not ready to invest in AMZN, especially with its super high P/E ratio.</p>
<p><strong>The Other Kindles</strong></p>
<p>While you certainly can read books on your Kindle Fire, I don&#8217;t think any more people will do that than read them on the iPad. And we can tell from prior Kindle sales numbers that dedicated reading is still a large enough segment to justify continued investment in the black and white Kindles. I have seen a significant number of people who carry both iPads and Kindles because the reading experience on a Kindle is so much better. I actually miss my black and white Kindle enough to have bought one of those too.</p>
<p>I am very impressed with their ad-subsidized (i.e. Special Offers) Kindles. They&#8217;ve managed to generate at least $30 in ad revenue per device, while making ads that in no way harm the experience of the device. I can only imagine that this amount will increase as they further optimize the ads and scale out their ad sales team (not to mention start leveraging their plethora of customer data to tailor the ads to each customer and provide analytics back to their advertisers). This could also be part of their plan for increasing the margins on the Kindle Fire.</p>
<p>The only thing I don&#8217;t understand about this product line is why it is so complicated. There are 4 devices to choose from: No touch, Touch, Keyboard and DX. It really seems like they could benefit at multiple levels by simplifying their product line to Touch and DX.</p>
<p>I&#8217;m also concerned about these Kindles from a profitability perspective. The only thing I can imagine is that they are taking an early loss to own the ebook market, and they will use that market share to disintermediate the big publishers and allow authors to publish direct to market. Then Amazon will split the publisher&#8217;s cut with the authors and finally create a good margin and a highly defensible monopoly, I mean business.</p>
<p><strong>The New Browser</strong></p>
<p>Who cares. No, seriously. I think they&#8217;ve created some really interesting technology to solve a problem that isn&#8217;t really holding people back &#8211; at least not on a wifi device. Now throw in an AT&amp;T 3G connection and this starts to get really interesting. But then, they haven&#8217;t announced 3G, at least not yet&#8230;</p>
<p><strong>What should Apple&#8217;s Response Be?</strong></p>
<p>There is no way that this isn&#8217;t a threat to Apple&#8217;s core business, business model and profit margin. The more I think about this, the less I believe this is competing head to head with the iPad; I really think it is competing more with the iPod Touch. If you assume that people aren&#8217;t going to use the Kindle Fire for eBooks, and they aren&#8217;t going to try to replace their laptop, then they are mostly going to end up watching TV and movies, listening to music, surfing the web, and playing games. That is exactly what people who only have an iPod Touch do.</p>
<p>Apple is going to have to respond here: the iPod is too small for games &amp; movies, and the iPad is too big and heavy. Apple knows this, which is why they spent so much R&amp;D making the iPad 2 33% thinner and lighter than the first iPad. But it isn&#8217;t enough. This is also a giant opportunity for them to disrupt the handheld video game market, which is currently more than $2.5 billion/year divided between the Nintendo DS, PSP and the iPod Touch. That&#8217;s not anywhere near as big as the iPad market, but for the iPod Touch, that&#8217;s not too shabby. And with Apple&#8217;s superior hardware and developer ecosystem, they should be able to build a sustainable advantage here that Amazon would have a hard time competing with. So, which would your kids want:</p>
<p>$200 Kindle Fire</p>
<ul>
<li>Cheap media player all the poor kids have (okay, poor kids with media players)</li>
<li>Can only play Angry Birds and simple &#8220;phone&#8221; games</li>
<li>Have to re-buy media from Amazon</li>
</ul>
<p>$350 Apple iPod 7</p>
<ul>
<li>HiRes screen</li>
<li>More beautiful than the Kindle Fire</li>
<li>All your existing Music, Movies and TV episodes</li>
<li>Premium video games, direct integration with Apple Game Center and AppleTV</li>
</ul>
<p>Alright, maybe that isn&#8217;t a fair comparison, but I don&#8217;t think people will be comparing these devices logically.</p>
<p>&nbsp;</p>
<p>Additional Reading</p>
<ul>
<li><a href="http://phx.corporate-ir.net/phoenix.zhtml?c=176060&amp;p=irol-newsArticle&amp;ID=1610968&amp;highlight=">Official Amazon Press Release</a> (Amazon)</li>
<li><a href="http://thisismynext.com/2011/09/28/amazon-kindle-tablet-pictures-videohands-on/">Product Review and Video</a> (ThisIsMyNext)</li>
<li><a href="http://daringfireball.net/2011/09/amazons_new_kindles">Good Strategic Assessment</a> (John Gruber)</li>
<li><a href="http://www.businessweek.com/magazine/the-omnivore-09282011.html">Historical Context</a> (BusinessWeek)</li>
<li><a href="http://cdespinosa.posterous.com/fire">Good Analysis of Amazon&#8217;s Silk Browser</a> (Chris Espinosa)</li>
</ul>
<p>&nbsp;</p>
<p>References</p>
<ul>
<li><a href="http://www.comscore.com/Press_Events/Press_Releases/2010/3/comScore_Releases_Results_of_Study_on_Apple_iPad">How people use their iPads</a> (ComScore)</li>
<li><a href="http://blog.flurry.com/bid/31566/Apple-iPhone-and-iPod-touch-Capture-U-S-Video-Game-Market-Share">Size of the Hand-held gaming market</a> (Flurry)</li>
<li><a href="http://www.bing.com/Finance/search?q=AMZN&amp;FORM=DTPFIO">Amazon Financial Information</a> (Bing Finance)</li>
<li><a href="http://www.bing.com/Finance/search?q=AAPL&amp;FORM=DTPFIO">Apple Financial Information</a> (Bing Finance)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://nathanbuggia.com/posts/the-kindle-on-fire/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Managing Search Engine Access to Your Site</title>
		<link>http://nathanbuggia.com/posts/managing-search-engine-access-to-your-site/</link>
		<comments>http://nathanbuggia.com/posts/managing-search-engine-access-to-your-site/#comments</comments>
		<pubDate>Thu, 04 Jun 2009 09:19:04 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[Search & SEO]]></category>

		<guid isPermaLink="false">http://nathanbuggia.com/?p=312</guid>
		<description><![CDATA[<p>Note: Originally published on http://janeandrobot.com</p> <p>Controlling what content is blocked from being found in search engines is crucial for many websites. Fortunately, the major search engines and other well-behaved robots observe the <a href="http://www.robotstxt.org/" target="_blank">Robots Exclusion Protocol</a> (REP), which has evolved organically since the early 1990s to provide a set of controls over what parts [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Note: Originally published on http://janeandrobot.com</strong></p>
<p>Controlling which content search engines can find is crucial for many websites. Fortunately, the major search engines and other well-behaved robots observe the <a href="http://www.robotstxt.org/" target="_blank">Robots Exclusion Protocol</a> (REP), which has evolved organically since the early 1990s to provide a set of controls over which parts of a web site search engine robots can crawl and index.</p>
<p>Article Sections:</p>
<ul>
<li><a href="#Capabilities_of_the_REP">Capabilities of REP</a></li>
<li><a href="#Deciding_what_should_be_Public_vs._Private">Deciding What Should be Public vs. Private</a></li>
<li><a href="#Implementing_the_REP">Implementing the REP</a>
<ul style="margin-bottom: 0pt;">
<li><a href="#Site_Level_Implementation_(Robots.txt)">Site Level</a></li>
<li><a href="#Page_Level_Implementation_(META_Tags)">Page Level (Meta Tags)</a></li>
<li><a href="#HTTP_Header_Implementation_(X-ROBOTS-Tag)">Page Level (HTTP Header)</a></li>
<li><a href="#Content_Level_Implementation">Content Level</a></li>
</ul>
</li>
<li><a href="#Common_implementation_mistakes">Common Mistakes</a></li>
<li><a href="#Testing_your_implementation_">Testing Your Implementation</a></li>
<li><a href="#removal">Removing Content From Search Engine Indices</a></li>
<li><a href="#Additional_Resources:_">Additional Resources</a></li>
</ul>
<h2><a name="Capabilities_of_the_REP"></a>Capabilities of the REP</h2>
<p>The Robots Exclusion Protocol provides controls that can be applied at the site level (robots.txt), at the page level (META tag, or X-Robots-Tag), or at the HTML element level to control both the crawl of your site and the way it’s listed in the search engine results pages (SERPs). Below is a table listing the common scenarios, directives, and which search engines support them.</p>
<table style="border-width: 1px !important;">
<tbody>
<tr>
<td valign="top"><strong>Use Case</strong></td>
<td style="text-align: center;" valign="top"><strong>Robots.txt</strong></td>
<td style="text-align: center;" valign="top"><strong>META/ X-Robots-Tag</strong></td>
<td style="text-align: center;" valign="top"><strong>Other</strong></td>
<td style="text-align: center;" valign="top"><strong>Supported By</strong></td>
</tr>
<tr>
<td valign="top">Allow access to your content</td>
<td style="text-align: center;" valign="top">Allow</td>
<td style="text-align: center;" valign="top">FOLLOW, INDEX</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40364">Google</a>, <a href="http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-02.html">Yahoo</a>, <a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx">Microsoft</a></td>
</tr>
<tr>
<td valign="top">Disallow access to your content</td>
<td style="text-align: center;" valign="top">Disallow</td>
<td style="text-align: center;" valign="top">NOINDEX, NOFOLLOW</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;answer=35303">Google</a>, <a href="http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-02.html">Yahoo</a>, <a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx">Microsoft</a></td>
</tr>
<tr>
<td valign="top">Disallow access to index images on the page</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top">NOIMAGEINDEX</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;answer=79892">Google</a></td>
</tr>
<tr>
<td valign="top">Disallow the display of a cached version of your content in the SERP</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top">NOARCHIVE</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35306=">Google</a>, <a href="http://help.yahoo.com/l/us/yahoo/search/deletion/basics-10.html">Yahoo</a>, <a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx">Microsoft</a></td>
</tr>
<tr>
<td valign="top">Disallow the creation of a description for this content in the SERP</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top">NOSNIPPET</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35304">Google</a>, <a href="http://www.ysearchblog.com/archives/000587.html">Yahoo</a>, <a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx">Microsoft</a></td>
</tr>
<tr>
<td valign="top">Disallow the translation of your content into other languages</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top">NOTRANSLATE</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://www.google.com/help/faq_translation.html#donttrans">Google</a></td>
</tr>
<tr>
<td valign="top">Do not follow or give weight to links within this content</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top">NOFOLLOW</td>
<td style="text-align: center;" valign="top"><code>rel="nofollow"</code> attribute on the <code>a</code> element</td>
<td style="text-align: center;" valign="top"><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=96569">Google</a>, <a href="http://www.ysearchblog.com/archives/000069.html">Yahoo</a>, <a href="http://blogs.msdn.com/livesearch/archive/2005/01/18/nofollow_tags.aspx">Microsoft</a></td>
</tr>
<tr>
<td valign="top">Do not use the <a href="http://www.dmoz.org/" target="_blank">Open Directory Project</a> (ODP) to create descriptions for your content in the SERP</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top">NOODP</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35264">Google</a>, <a href="http://help.yahoo.com/l/us/yahoo/search/indexing/indexing-11.html">Yahoo</a>, <a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx">Microsoft</a></td>
</tr>
<tr>
<td valign="top">Do not use the Yahoo Directory to create descriptions for your content in the SERP</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top">NOYDIR</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx">Yahoo</a></td>
</tr>
<tr>
<td valign="top">Do not index this specific element within an HTML page</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top">class=robots-nocontent</td>
<td style="text-align: center;" valign="top"><a href="http://www.ysearchblog.com/archives/000444.html">Yahoo</a></td>
</tr>
<tr>
<td valign="top">Stop indexing this content after a specific date</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top">UNAVAILABLE_AFTER</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://googleblog.blogspot.com/2007/07/robots-exclusion-protocol-now-with-even.html">Google</a></td>
</tr>
<tr>
<td valign="top">Disallow the creation of enhanced captions</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top">NOPREVIEW</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://bing.com/community">Microsoft</a></td>
</tr>
<tr>
<td valign="top">Specify a sitemap file or a sitemap index file</td>
<td style="text-align: center;" valign="top">Sitemap</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;answer=64748">Google</a>, <a href="http://www.ysearchblog.com/archives/000437.html">Yahoo</a>, <a href="http://blogs.msdn.com/livesearch/archive/2007/04/11/discovering-sitemaps.aspx">Microsoft</a></td>
</tr>
<tr>
<td valign="top">Specify how frequently a crawler may access your website</td>
<td style="text-align: center;" valign="top">Crawl-Delay</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://google.com/webmaster">Google WMT</a></td>
<td style="text-align: center;" valign="top"><a href="http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-03.html">Yahoo</a>, <a href="http://blogs.msdn.com/webmaster/archive/2008/04/18/ramping-up-msnbot.aspx">Microsoft</a></td>
</tr>
<tr>
<td valign="top">Authenticate the identity of the crawler</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top">Reverse DNS Lookup</td>
<td style="text-align: center;" valign="top"><a href="http://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html">Google</a>, <a href="http://www.ysearchblog.com/archives/000460.html">Yahoo</a>, <a href="http://blogs.msdn.com/livesearch/archive/2006/11/29/search-robots-in-disguise.aspx">Microsoft</a></td>
</tr>
<tr>
<td valign="top">Request removal of your content from the engine’s index</td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"></td>
<td style="text-align: center;" valign="top"><a href="http://google.com/webmaster">Google WMT</a>, <a href="http://siteexplorer.search.yahoo.com">Yahoo SE</a>, <a href="http://webmaster.live.com/">Microsoft WMT</a></td>
<td style="text-align: center;" valign="top"><a href="http://googlewebmastercentral.blogspot.com/2007/04/requesting-removal-of-content-from-our.html">Google</a>, <a href="http://help.yahoo.com/l/us/yahoo/search/siteexplorer/delete/">Yahoo</a>, Microsoft</td>
</tr>
</tbody>
</table>
<h2><a name="Deciding_what_should_be_Public_vs._Private"></a>Deciding What Should be Public vs. Private</h2>
<p>One of the first steps in managing the robots is knowing what type of content should be public vs. private. Start with the assumption that by default, everything is public, then explicitly identify the items that are private.</p>
<p>If you want search engines to access all the content on your site, you don’t need a robots.txt file at all. When a search engine tries to access the robots.txt file on your site and the server can’t return one (ideally by returning a 404 HTTP status code), the search engine treats this the same as a robots.txt file that allows access to everything.</p>
<p>Every website and every business has a different set of needs, so there’s no blanket rule for what to make private, but some common elements may apply.</p>
<ul>
<li><strong>Private data</strong> – You may have content on your site that you don’t want to appear in search engines. For instance, you may have private user information (such as addresses) that you don’t want surfaced. For this type of content, you may want to use a more secure approach that keeps all visitors from the pages (such as password protection). However, some types of content are fine for visitor access, but not search engine access. For instance, you may run a discussion forum that is open for public viewing, but you may not want individual posts to appear in search results for forum member names.</li>
<li><strong><a name="noncontent"></a>Non-content content</strong> – Some content, like <a href="http://nathanbuggia.com/post/Effectively-Using-Images.aspx#noncontent">images used for navigation</a>, provides little value to searchers. It’s not harmful to include these items in search engine indices, but since search engines allocate limited bandwidth to crawl each site and limited space to store content from each site, it may make sense to block these items to help direct the bots to the content on your site that you do want indexed.</li>
<li><strong>Printer-friendly pages</strong> – If you have specific pages (URLs) that are formatted for printing, you may want to block them to avoid duplicate content issues. The drawback to allowing the printer-friendly page to be indexed is that it could potentially be listed in the search results instead of the default version of the page, which wouldn’t provide an ideal user experience for a visitor coming to the site through search.</li>
<li><strong>Affiliate links and advertising</strong> – If you include advertising on your site, you can keep search engine robots from following the links by redirecting them to a blocked page, then on to the destination page. (There are other methods for implementing advertising-based links as well.)</li>
<li><strong>Landing pages</strong> – Your site may include multiple variations of entry pages used for advertising purposes. For instance, you may run AdWords campaigns that link to a particular version of a page based on the ad, or you may print different URLs for different print ad campaigns (either for tracking purposes or to provide a custom experience related to the ad). Since these pages are meant to be an extension of the ad, and are generally near duplicates of the default version of the page, you may want to block these landing pages from being indexed.</li>
<li><strong>Experimental pages</strong> – As you try new ideas on your site (for instance, using A/B testing), you likely want to block all but the original page from being indexed during the experiment.</li>
</ul>
<h2><a name="Implementing_the_REP"></a>Implementing the REP</h2>
<p>REP is flexible and can be implemented a number of ways. This flexibility lets you easily specify some policies for your entire site (or subdomain) and then enhance them more granularly at the page or link level as needed.</p>
<h3><a name="Site_Level_Implementation_(Robots.txt)"></a>Site Level Implementation (Robots.txt)</h3>
<p>Site-wide directives are stored in a robots.txt file, which must be located in the root directory of each domain or subdomain (e.g. <a href="http://nathanbuggia.com/robots.txt">http://janeandrobot.com/robots.txt</a>). Note that a robots.txt file applies only to the hostname where it is placed and does not apply to subdomains. So a robots.txt file located at <a href="http://microsoft.com/robots.txt">http://microsoft.com/robots.txt</a> will not apply to the MSDN subdomain <a href="http://msdn.microsoft.com/">http://msdn.microsoft.com</a>. However, the robots.txt file does apply to all subfolders and pages within the specified hostname.</p>
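<p>To make the hostname rule concrete, here is a minimal Python sketch that derives the robots.txt URL governing a given page. The helper name <code>robots_txt_url</code> and the page paths are made up for illustration; only the two hostnames come from the example above.</p>

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL for the scheme and hostname of page_url.

    Hypothetical helper: the path, query, and fragment of the page are
    discarded because robots.txt always lives at the root of the hostname.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# The page paths below are invented; the hostnames are from the text.
root_rules = robots_txt_url("http://microsoft.com/some/page.html")
msdn_rules = robots_txt_url("http://msdn.microsoft.com/en-us/library/")

print(root_rules)
print(msdn_rules)
```

<p>The two results differ, which is the point: a page on the subdomain is governed by the subdomain's own robots.txt file, not by the one on the parent domain.</p>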
<p>A robots.txt file is a UTF-8 encoded file that contains entries that consist of a user-agent line (that tells the search engine robot if the entry is directed at it) and one or more directives that specify content that the search engine robot is blocked from crawling or indexing. A simple robots.txt file is shown below.</p>
<div class="csharpcode">
<pre class="alt">User-agent: *</pre>
<pre class="alt">Disallow: /private</pre>
</div>
<p><code>user-agent:</code> – Specifies which robots the entry applies to.</p>
<ul>
<li>Set this to <code>*</code> to specify that this entry applies to all search engine robots.</li>
<li>Set this to a specific robot name to provide instructions for just that robot. You can find a complete list of robot names at <a href="http://www.robotstxt.org">robotstxt.org</a>.</li>
<li>If you direct an entry at a particular robot, then it obeys that entry <em>instead</em> of any entries defined for <code>user-agent: *</code> (rather than in addition to those entries).</li>
</ul>
<p>The major search engines have multiple robots that crawl the web for different types of content (such as images or mobile). They generally begin all robots with the same name so that if you block the major robot, all robots for that search engine are blocked as well. However, if you want to block only the more specific robot, you can block it directly and still allow web crawl access.</p>
<ul>
<li><strong><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40364">Google</a></strong> – The primary search engine robot is Googlebot.</li>
<li><strong><a href="http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-02.html">Yahoo!</a></strong> – The primary search engine robot is Slurp.</li>
<li><strong><a href="http://blogs.msdn.com/livesearch/archive/2006/11/29/search-robots-in-disguise.aspx">Live Search</a></strong> – The primary search engine robot is MSNbot.</li>
</ul>
<p><code>Disallow:</code> – Specifies which content is blocked.</p>
<ul>
<li>Must begin with a slash (<code>/</code>).</li>
<li>Blocks access to any URL whose path begins with the specified value. For instance, <code>Disallow: /images</code> blocks access to <code>/images/</code>, <code>/images/image1.jpg</code>, and <code>/images10</code>.</li>
</ul>
<p>You can specify other rules for search engine robots in addition to the standard instructions that block access to content as noted in <a href="#other">other robot instructions</a>.</p>
<p>Some things to note about robots.txt implementation:</p>
<ul>
<li>The major search engines support pattern matching using the asterisk character (*) for wildcard match and the dollar sign ($) for end of sequence matching as described below in <a href="#patterns">using pattern matching</a>.</li>
<li>The robots.txt file is case sensitive, so <code>Disallow: /images </code>would block <code>http://www.example.com/images</code> but not <code>http://www.example.com/Images</code>.</li>
<li>If conflicts exist in the file, the robot obeys the longest (and therefore generally more specific) line.</li>
</ul>
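<p>The prefix matching, case sensitivity and longest-line rules described above can be sketched in a few lines of Python. This is only an illustration, not any search engine's actual parser: the function name <code>is_allowed</code> is made up for this example, it ignores pattern matching, and it assumes that an <code>Allow</code> line wins over a <code>Disallow</code> line of the same length.</p>

```python
def is_allowed(path, rules):
    """Decide whether `path` may be crawled under a list of (directive, prefix) rules.

    Assumptions of this sketch: plain prefix matching only (no * or $),
    the longest matching prefix wins, and Allow beats Disallow on a tie.
    """
    matching = [(len(prefix), directive.lower() == "allow")
                for directive, prefix in rules
                if prefix and path.startswith(prefix)]
    if not matching:
        return True  # nothing matched, so crawling is allowed by default
    # max() compares the prefix length first, then True (allow) over False.
    longest_length, allowed = max(matching)
    return allowed


rules = [("Disallow", "/images/"), ("Allow", "/images/image1.jpg")]
print(is_allowed("/images/image1.jpg", rules))  # True
print(is_allowed("/images/image2.jpg", rules))  # False
print(is_allowed("/Images/image2.jpg", rules))  # True: matching is case sensitive
```

<p>Real robots may resolve ties and conflicts differently, so treat this as a way to reason about your rules rather than as a guarantee of how a given engine behaves.</p>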
<h4>Basic Samples</h4>
<p><em><strong>Block all robots</strong></em></p>
<p>Useful when your site is in pre-launch development and isn’t ready for search traffic.</p>
<div class="csharpcode">
<pre class="alt"># This keeps out all well-behaved robots.</pre>
<pre class="alt"># Disallow: * is not valid.</pre>
<pre class="alt">User-agent: *</pre>
<pre class="alt">Disallow: /</pre>
<pre class="alt"></pre>
</div>
<p><em><strong>Keep out all bots by default</strong></em></p>
<p>Blocks all pages except those specified. Not recommended, as it is difficult to maintain and diagnose.</p>
<div class="csharpcode">
<pre class="alt"># Stay out unless otherwise stated</pre>
<pre class="alt">User-agent: *</pre>
<pre class="alt">Disallow: /</pre>
<pre class="alt">Allow: /Public/</pre>
<pre class="alt">Allow: /articles/</pre>
<pre class="alt">Allow: /images/</pre>
<pre class="alt"></pre>
</div>
<p><strong><em>Block specific content</em> </strong></p>
<p>The most common usage of robots.txt.</p>
<div class="csharpcode">
<pre class="alt"># Block access to the images folder</pre>
<pre class="alt">User-agent: *</pre>
<pre class="alt">Disallow: /images/</pre>
<pre class="alt"></pre>
</div>
<p><a name="allow"></a><strong><em>Allow specific content</em> </strong></p>
<p>Block a folder, but allow access to selected pages in that folder.</p>
<div class="csharpcode">
<pre class="alt"># Block everything in the images folder</pre>
<pre class="alt"># Except allow images/image1.jpg</pre>
<pre class="alt">User-agent: *</pre>
<pre class="alt">Disallow: /images/</pre>
<pre class="alt">Allow: /images/image1.jpg</pre>
<pre class="alt"></pre>
</div>
<p><em><strong>Allow specific robot</strong></em></p>
<p>Block a class of robots (for instance, Googlebot), but allow a specific bot in that class (for instance, Googlebot-Mobile).</p>
<div class="csharpcode">
<pre class="alt"># Block Googlebot access</pre>
<pre class="alt"># Allow Googlebot-Mobile access</pre>
<pre class="alt">User-agent: Googlebot</pre>
<pre class="alt">Disallow: /</pre>
<pre class="alt">User-agent: Googlebot-Mobile</pre>
<pre class="alt">Allow: /</pre>
<pre class="alt"></pre>
</div>
<h4>Pattern Matching Examples</h4>
<p>The major engines support two types of pattern matching.</p>
<ul>
<li><strong>*</strong> matches any sequence of characters</li>
<li><strong>$</strong> matches the end of URL</li>
</ul>
<p><em><strong>Block access to URLs that contain a set of characters</strong></em></p>
<p>Use the asterisk (*) to specify a wildcard.</p>
<div class="csharpcode">
<pre class="alt"># Block access to all URLs that include an ampersand</pre>
<pre class="alt">User-agent: *</pre>
<pre class="alt">Disallow: /*&amp;</pre>
<pre class="alt"></pre>
</div>
<p>This directive would block search engines from crawling <code>http://www.example.com/page1.asp?id=5&amp;sessionid=xyz</code>.</p>
<p><strong><em>Block access to URLs that end with a set of characters</em> </strong></p>
<p>Use the dollar sign ($) to specify the end of the URL.</p>
<div class="answer_heading">
<div class="csharpcode">
<pre class="alt"># Block access to all URLs that end in .cgi</pre>
<pre class="alt">User-agent: *</pre>
<pre class="alt">Disallow: /*.cgi$</pre>
<pre class="alt"></pre>
</div>
<p>This directive would block search engines from crawling <code>http://www.example.com/script1.cgi</code> but <em>not</em> from crawling <code>http://www.example.com/script1.cgi?value=1</code>.</p>
<p><em><strong>Selectively allow access to a URL that matches a blocked pattern</strong></em></p>
<p>Use the <code>Allow</code> directive in conjunction with pattern matching for more complex implementations.</p>
<div class="csharpcode">
<pre class="alt"># Block access to URLs that contain ?</pre>
<pre class="alt"># Allow access to URLs that end in ?</pre>
<pre class="alt">User-agent: *</pre>
<pre class="alt">Disallow: /*?</pre>
<pre class="alt">Allow: /*?$</pre>
<pre class="alt"></pre>
</div>
<p>That directive blocks all URLs that contain <code>?</code> except those that end in <code>?</code>. In this example, the default version of the page will be indexable:</p>
<ul>
<li><code>http://www.example.com/productlisting.aspx?</code></li>
</ul>
<p>Variations of the page will be blocked:</p>
<ul>
<li><code>http://www.example.com/productlisting.aspx?nav=price</code></li>
<li><code>http://www.example.com/productlisting.aspx?sort=alpha</code></li>
</ul>
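<p>If you want to check a pattern against a handful of URLs before you deploy it, the two wildcards are easy to emulate. The following Python sketch translates a robots.txt pattern into a regular expression. The helper names <code>pattern_to_regex</code> and <code>pattern_matches</code> are invented for this example, and the translation is an approximation of the behavior described above, not the code any engine actually runs.</p>

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal chunk, then rejoin with ".*" for every "*" wildcard.
    body = ".*".join(re.escape(chunk) for chunk in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def pattern_matches(pattern, path):
    # re.match anchors at the start, so a rule without "$" is still a prefix match.
    return pattern_to_regex(pattern).match(path) is not None


print(pattern_matches("/*.cgi$", "/script1.cgi"))          # True
print(pattern_matches("/*.cgi$", "/script1.cgi?value=1"))  # False
print(pattern_matches("/*?$", "/productlisting.aspx?"))    # True
```

<p>The path passed in should include the query string, since that is what the examples in this section match against.</p>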
<h4><a name="other"></a>Other robot instructions</h4>
</div>
<p><em><strong><span class="style2">Specify a Sitemap or Sitemap index file</span> </strong></em></p>
<p>If you’d like to provide search engines with a comprehensive list of your best URLs, you can provide one or more <a href="http://sitemaps.org" target="_blank">Sitemap</a> autodiscovery directives. Note, user-agent does not apply to this directive so you cannot use this to specify a Sitemap to some but not all search engines.</p>
<div class="csharpcode">
<pre class="alt"># Please take my sitemap and index everything!</pre>
<pre class="alt">Sitemap: <a href="http://janeandrobot.com/sitemap.axd">http://janeandrobot.com/sitemap.axd</a></pre>
<pre class="alt"></pre>
</div>
<p><strong><em>Reduce the crawling load</em> </strong></p>
<p>This only works with Microsoft and Yahoo. For Google you’ll need to specify a slower crawling speed through their <a href="http://google.com/webmaster" target="_blank">Webmaster Tools</a>. Be careful when implementing this because if you slow down the crawl too much, robots won’t be able to get to all of your site and you may lose pages from the index.</p>
<div class="csharpcode">
<pre class="alt"># MSNBot, please wait 5 seconds in between visits</pre>
<pre class="alt">User-agent: msnbot</pre>
<pre class="alt">Crawl-delay: 5</pre>
<p>&nbsp;</p>
<pre class="alt"># Yahoo's Slurp, please wait 12 seconds in between visits</pre>
<pre class="alt">User-agent: slurp</pre>
<pre class="alt">Crawl-delay: 12</pre>
<pre class="alt"></pre>
</div>
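<p>One way to double-check that the delays are attached to the robots you intended is to read the file back with a parser. The sketch below uses Python's standard <code>urllib.robotparser</code> module (its <code>crawl_delay</code> method needs Python 3.6 or later). It shows how that library reads the file, which is not necessarily how MSNBot or Slurp read it.</p>

```python
from urllib.robotparser import RobotFileParser

# The same entries as the sample above, with a blank line between them.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 5

User-agent: slurp
Crawl-delay: 12
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.crawl_delay("msnbot"))  # 5
print(parser.crawl_delay("slurp"))   # 12
```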
<h3><a name="Page_Level_Implementation_(META_Tags)"></a>Page Level Implementation (META Tags)</h3>
<p>The REP page-level directives allow you to refine the site-wide policies on a page-by-page basis.</p>
<p><strong><em>Placing a meta tag on the page</em> </strong></p>
<p>Place the meta tag inside the page’s head element, and separate multiple directives with commas, e.g. <code>&lt;meta name="ROBOTS" content="Directive1, Directive2"&gt;</code>.</p>
<div class="csharpcode">
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">html</span><span class="kwrd">&gt;</span></pre>
<pre class="alt"><span class="kwrd"> &lt;</span><span class="html">head</span><span class="kwrd">&gt;</span></pre>
<pre class="alt"><span class="kwrd"> &lt;</span><span class="html">title</span><span class="kwrd">&gt;</span>Your title here<span class="kwrd">&lt;/</span><span class="html">title</span><span class="kwrd">&gt;</span></pre>
<pre class="alt"><span class="kwrd"> &lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="ROBOTS"</span> <span class="attr">content</span><span class="kwrd">="NOINDEX"</span><span class="kwrd">&gt;</span></pre>
<pre class="alt"><span class="kwrd"> &lt;/</span><span class="html">head</span><span class="kwrd">&gt;</span></pre>
<pre class="alt"><span class="kwrd"> &lt;</span><span class="html">body</span><span class="kwrd">&gt;</span>Your page here<span class="kwrd">&lt;/</span><span class="html">body</span><span class="kwrd">&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;/</span><span class="html">html</span><span class="kwrd">&gt;</span></pre>
<pre class="alt"></pre>
</div>
<p><strong><em>Targeting a specific search engine</em> </strong></p>
<p>Within the meta tag you can specify which search engine you would like to target, or you can target them all.</p>
<div class="csharpcode">
<pre class="alt"><span class="rem">&lt;!-- Applies to All Robots --&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="ROBOTS"</span> <span class="attr">content</span><span class="kwrd">="NOINDEX"</span><span class="kwrd">&gt;</span></pre>
<p>&nbsp;</p>
<pre class="alt"><span class="rem">&lt;!-- ONLY GoogleBot --&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="Googlebot"</span> <span class="attr">content</span><span class="kwrd">="NOINDEX"</span><span class="kwrd">&gt;</span></pre>
<p>&nbsp;</p>
<pre class="alt"><span class="rem">&lt;!-- ONLY Slurp (Yahoo) --&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="Slurp"</span> <span class="attr">content</span><span class="kwrd">="NOINDEX"</span><span class="kwrd">&gt;</span></pre>
<p>&nbsp;</p>
<pre class="alt"><span class="rem">&lt;!-- ONLY MSNBot (Microsoft) --&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="MSNBot"</span> <span class="attr">content</span><span class="kwrd">="NOINDEX"</span><span class="kwrd">&gt;</span></pre>
</div>
<p><strong><em>Control how your listings appear</em> </strong></p>
<p>There is a set of options you can use to determine how your site shows up on the search engine results page (SERP). You can exert some control over how the description is created, and you can remove the “Cached page” link.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-85" style="border: black 1px solid;" title="example-serp" src="http://janeandrobot.com/wp-content/uploads/2008/06/example-serp.gif" alt="example-serp" /></p>
<div class="csharpcode">
<pre class="alt"><span class="rem">&lt;!-- Do not show a description for this page --&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="ROBOTS"</span> <span class="attr">content</span><span class="kwrd">="NOSNIPPET"</span><span class="kwrd">&gt;</span></pre>
<p>&nbsp;</p>
<pre class="alt"><span class="rem">&lt;!-- Do not use http://dmoz.org to create a description --&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="ROBOTS"</span> <span class="attr">content</span><span class="kwrd">="NOODP"</span><span class="kwrd">&gt;</span></pre>
<p>&nbsp;</p>
<pre class="alt"><span class="rem">&lt;!-- Do not present a cached version of the document in a search result --&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="ROBOTS"</span> <span class="attr">content</span><span class="kwrd">="NOARCHIVE"</span><span class="kwrd">&gt;</span></pre>
<pre class="alt"></pre>
</div>
<p><em><strong>Using other directives</strong></em></p>
<p>Other meta robots directives are shown below.</p>
<div class="csharpcode">
<pre class="alt"><span class="rem">&lt;!-- Do not trust links on this page, could be user generated content (UGC) --&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="ROBOTS"</span> <span class="attr">content</span><span class="kwrd">="NOFOLLOW"</span><span class="kwrd">&gt;</span></pre>
<p>&nbsp;</p>
<pre class="alt"><span class="rem">&lt;!-- Do not index this page --&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="ROBOTS"</span> <span class="attr">content</span><span class="kwrd">="NOINDEX"</span><span class="kwrd">&gt;</span></pre>
<p>&nbsp;</p>
<pre class="alt"><span class="rem">&lt;!-- Do not index any images on this page (will still index them if they are linked</span></pre>
<pre class="alt"><span class="rem"> elsewhere) Better to use Robots.txt if you really want them safe.</span></pre>
<pre class="alt"><span class="rem"> This is a Google Only tag. --&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="GOOGLEBOT"</span> <span class="attr">content</span><span class="kwrd">="NOIMAGEINDEX"</span><span class="kwrd">&gt;</span></pre>
<p>&nbsp;</p>
<pre class="alt"><span class="rem">&lt;!-- Do not translate this page into other languages--&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="ROBOTS"</span> <span class="attr">content</span><span class="kwrd">="NOTRANSLATE"</span><span class="kwrd">&gt;</span></pre>
<p>&nbsp;</p>
<pre class="alt"><span class="rem">&lt;!-- NOT RECOMMENDED, there really isn't much point in using these --&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="ROBOTS"</span> <span class="attr">content</span><span class="kwrd">="FOLLOW"</span><span class="kwrd">&gt;</span></pre>
<pre class="alt"><span class="kwrd">&lt;</span><span class="html">meta</span> <span class="attr">name</span><span class="kwrd">="ROBOTS"</span> <span class="attr">content</span><span class="kwrd">="UNAVAILABLE_AFTER"</span><span class="kwrd">&gt;</span></pre>
<pre class="alt"></pre>
</div>
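<p>To audit which directives a page actually carries for a given robot, you can parse its meta tags. Here is a minimal Python sketch using the standard <code>html.parser</code> module. The class and function names are made up for this example, and it assumes that a robot honors both the generic <code>ROBOTS</code> tag and a tag carrying its own name.</p>

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the robots directives on a page that apply to one robot."""

    def __init__(self, robot_name):
        super().__init__()
        self.robot_name = robot_name.lower()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attributes = {key: (value or "") for key, value in attrs}
        name = attributes.get("name", "").lower()
        if name in ("robots", self.robot_name):
            for directive in attributes.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().upper())


def robots_directives(html, robot_name):
    """Return the set of directives in `html` that apply to `robot_name`."""
    parser = RobotsMetaParser(robot_name)
    parser.feed(html)
    return parser.directives
```

<p>Calling <code>robots_directives(page_source, "googlebot")</code> on a page returns the directives from its <code>ROBOTS</code> and <code>Googlebot</code> meta tags, normalized to upper case, so you can compare them against what you intended.</p>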
<h3><a name="HTTP_Header_Implementation_(X-ROBOTS-Tag)"></a>HTTP Header Implementation (X-Robots-Tag)</h3>
<p>Allows developers to specify page-level REP directives for non text/html content types like PDF, DOC, PPT, or dynamically generated images.</p>
<p><strong><em>Using the X-Robots-Tag</em> </strong></p>
<p>To use the X-Robots-Tag, simply add it to your HTTP response headers as shown below. To specify multiple directives, you can either comma delimit them or add them as separate header items.</p>
<div class="csharpcode">
<pre class="alt">HTTP/1.x 200 OK</pre>
<pre class="alt">Cache-Control: private</pre>
<pre class="alt">Content-Length: 2199552</pre>
<pre class="alt">Content-Type: application/octet-stream</pre>
<pre class="alt">Server: Microsoft-IIS/7.0</pre>
<pre class="alt">content-disposition: inline; filename=01 - The truth about SEO.ppt</pre>
<pre class="alt"><strong>X-Robots-Tag: noindex, nosnippet</strong></pre>
<pre class="alt">X-Powered-By: ASP.NET</pre>
<pre class="alt">Date: Sun, 01 Jun 2008 19:25:47 GMT</pre>
</div>
<p>&nbsp;</p>
<p>The X-Robots-Tag directive supports most of the same directives as the meta tag. The only limitation with this method over the meta tag implementation is that there is no way to target a specific robot – though that probably isn’t a big deal for most use cases.</p>
<ul>
<li><span style="font-family: courier new;">X-Robots-Tag: noindex</span></li>
<li><span style="font-family: courier new;">X-Robots-Tag: nosnippet</span></li>
<li><span style="font-family: courier new;">X-Robots-Tag: notranslate</span></li>
<li><span style="font-family: courier new;">X-Robots-Tag: noarchive</span></li>
<li><span style="font-family: courier new;">X-Robots-Tag: unavailable_after: 7 Jul 2007 16:30:00 GMT</span></li>
</ul>
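<p>How you add the header depends on your web server or framework. As one illustration, here is a minimal Python WSGI application that attaches the header to a binary response. The application name and the placeholder body are hypothetical; only the header name and its directives come from the example above.</p>

```python
def presentation_app(environ, start_response):
    """Hypothetical WSGI app that serves a binary document and asks robots not to index it."""
    body = b"...binary document bytes..."
    headers = [
        ("Content-Type", "application/octet-stream"),
        ("Content-Length", str(len(body))),
        # Multiple directives can share one comma-delimited header value.
        ("X-Robots-Tag", "noindex, nosnippet"),
    ]
    start_response("200 OK", headers)
    return [body]
```

<p>To try it locally, you can pass the function to <code>make_server</code> from Python's standard <code>wsgiref.simple_server</code> module and inspect the response headers in your browser's developer tools.</p>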
<h3><a name="Content_Level_Implementation"></a>Content Level Implementation</h3>
<p>You can further refine your site level and page level directives within several content tags.</p>
<p>Each anchor tag (link) can be modified to tell search engines that you do not trust where this URL is pointing to. This is typically used for links within user generated content (UGC) like wikis, blog comments, reviews and other community sites.</p>
<div class="csharpcode">
<pre class="alt">&lt;a href="#" rel="NOFOLLOW"&gt;My Hyperlink&lt;/a&gt;</pre>
<pre class="alt"></pre>
</div>
<p>Also, in Yahoo Search you can specify which &lt;div&gt; elements on a page you would not like indexed using the <code>class="robots-nocontent"</code> attribute. However, we don’t really recommend using this attribute because it is not supported by any other engine, which makes it not super-useful.</p>
<div class="csharpcode">
<pre class="alt">&lt;div class="robots-nocontent"&gt;</pre>
<pre class="alt">No content for you! (or at least Yahoo!)</pre>
<pre class="alt">&lt;/div&gt;</pre>
<pre class="alt"></pre>
</div>
<h2><a name="Common_implementation_mistakes"></a>Common Mistakes</h2>
<p>While implementing the REP is generally straight-forward, there are a few common mistakes.</p>
<p><strong><em>GoogleBot follows the most specific directive, ignoring all others</em> </strong></p>
<p>In the robots.txt file, if you specify a section for all user-agents (<code>user-agent: *</code>) and also declare a section for Googlebot (<code>user-agent: Googlebot</code>), Google will disregard all sections in the robots.txt file except the Googlebot section. This could potentially leave you exposing much more content to Google than you might have thought.</p>
<div class="csharpcode">
<pre class="alt"># This keeps out all well-behaved robots</pre>
<pre class="alt">User-agent: *</pre>
<pre class="alt">Disallow: /</pre>
<p>&nbsp;</p>
<pre class="alt"># This looks like it is giving Google access to only this directory, but since it is a</pre>
<pre class="alt"># GoogleBot specific section, Google will disregard the previous section</pre>
<pre class="alt"># and access the whole site.</pre>
<pre class="alt">User-agent: Googlebot</pre>
<pre class="alt">Allow: /Content_For_Google/</pre>
<pre class="alt"></pre>
</div>
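<p>You can reproduce this behavior locally before a real robot finds it. The sketch below feeds the same file to Python's standard <code>urllib.robotparser</code> module, which also applies only the most specific matching section. It is a stand-in for illustration, not Google's own parser, and the URL and the <code>SomeOtherBot</code> name are placeholders.</p>

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /Content_For_Google/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "http://www.example.com/private/report.html"
print(parser.can_fetch("Googlebot", url))     # True: only the Googlebot section applies
print(parser.can_fetch("SomeOtherBot", url))  # False: falls back to the * section
```

<p>Adding <code>Disallow: /</code> to the Googlebot section as well is what actually restricts it to the one directory.</p>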
<p><strong><em>NOFOLLOW will most likely not prevent indexing</em> </strong></p>
<p>If you use <code>NOFOLLOW</code> at either the page or the link level, it is still possible for the linked pages to be indexed, because the search engine may have found a reference to them from another source. Note also that the search engines treat <code>rel="NOFOLLOW"</code> within your anchor tag as a recommendation, not a command. To ensure that content is not indexed, either use the <code>Disallow</code> directive at the site level, or use <code>NOINDEX</code> at the page level.</p>
<p><strong><em>Directives that are not recommended</em> </strong></p>
<p>Directives in the REP are all about exceptions: by default, the robots assume they can crawl your whole site. Therefore, you do not need to explicitly use the <code>FOLLOW</code> and <code>INDEX</code> directives, as they will not be taken into account by the search engines. It sounds silly, but I’ve seen a few sites that have implemented these on every page and every link.</p>
<p>Another directive that is not recommended is the <code>NOCACHE</code> directive. This was created by Microsoft, and is synonymous with <code>NOARCHIVE</code>. While they will most likely always continue to support the directive, it is better to use <code>NOARCHIVE</code> so it will work on all the search engines.</p>
<p><strong><em>Be cognizant of case</em> </strong></p>
<p>When referencing files and URLs in the robots.txt file, use a defensive approach to URL case, as the major engines do not handle it the same way (e.g. /Files does not always equal /files).</p>
<h2><a name="Testing_your_implementation_"></a>Testing Your Implementation</h2>
<p>As you’re implementing your REP design, you should test it both before you deploy it and after. The easiest way to test this is to use the robots validator in either Google or Microsoft’s Webmaster Tools. These tools are generally good enough test beds for most folks; however, advanced developers (or paranoid ones with critical business requirements) will want to know definitively what the robots are doing, not simply rely on what the robots say they are doing. These folks will want to look at the tools as well as their server logs.</p>
<p>In addition to using validation tools, reviewing the search engines’ reports on what they couldn’t access, and looking at log data to see what the search engine robots are crawling, you should check the search engine results to see if any pages you are intending to block are being indexed. If they are, use the methods described in this section to ensure you are blocking them correctly and <a href="#removal">use the search engine tools to request that the pages be removed</a>.</p>
<p><a name="partial"></a><strong><em>When Blocked Content Appears to be Indexed</em> </strong></p>
<p>If search engines are blocked from crawling pages, they may still index the URL if the robot finds a link to that URL on a page that isn’t blocked. The listing may display the URL only, such as shown below.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-84" title="urlonly" src="http://janeandrobot.com/wp-content/uploads/2008/06/urlonly.gif" alt="urlonly" /></p>
<p>Or, it may include a title and in some instances, a description. This makes it appear as though the search engine robot is disregarding the directive that blocks access to the page, but the search engine is in fact obeying the directive not to crawl the page and is using anchor text from the link to that page and descriptive details from either the page that contains the link or a source such as the <a href="http://www.dmoz.org">Open Directory Project</a>.</p>
<p>For more details, see:</p>
<ul>
<li><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35667">Google: partially indexed page</a></li>
<li><a href="http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-01.html">Yahoo!: thin documents</a></li>
</ul>
<h3><a name="The_Easy_Way_"></a>The Easy Way</h3>
<p>Both Google and Microsoft provide some tools as part of their Webmaster Centers to help you verify if you’ve configured your REP the way you expect. Let’s start with Google’s tools:</p>
<p>The first thing you should check is the list of URLs that Google has seen from your website and not indexed due to the REP. Note you can also download the list and filter, sort, and have-your-way-with-it in Excel.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-83" style="border: black 1px solid;" title="webmaster-robotstxt-blocked1" src="http://janeandrobot.com/wp-content/uploads/2008/06/webmaster-robotstxt-blocked1.gif" alt="webmaster-robotstxt-blocked1" /></p>
<p>The next step is to use their interactive robots.txt tool to analyze your rules and test specific URLs for blockage. When you pull up the tool they already should have it pre-populated with the robots.txt file they have on file from the last time they crawled. You can input a list of URLs you’d like to check below, select the user-agent you’d like to check against and the tool will tell you if they are blocked or not. You can also use the tool to test changes to your robots.txt file to see how Google would interpret things.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-81" style="border: black 1px solid;" title="google-analyze-robotstxt" src="http://janeandrobot.com/wp-content/uploads/2008/06/google-analyze-robotstxt.jpg" alt="google-analyze-robotstxt" /></p>
<p>Microsoft has a similar tool in their <a href="http://webmaster.live.com/">Webmaster Center</a> that will validate a robots.txt file against the standard that MSNBot supports. To use the tool, simply log in, copy and paste your robots.txt file into the top field, and select <strong>Validate</strong>. A list of all detectable issues is displayed in the bottom box.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-79" style="border: black 1px solid;" title="microsoft-robotstxt-validat" src="http://janeandrobot.com/wp-content/uploads/2008/06/microsoft-robotstxt-validat.jpg" alt="microsoft-robotstxt-validat" /></p>
<h3><a name="The_Hard_Way_(More_Accurate)"></a>The Hard Way</h3>
<p><strong><em>More Accurate Views of Robot Access Through Your Logs</em> </strong></p>
<p>If you have a specific business need to ensure that the robots are following your rules (or you’re just paranoid), then you should not simply rely on the tools they provide to test compliance. You’re going to need to go straight to the horse’s mouth and analyze your web server logs to see exactly what they are doing. There is no one easy tool for doing this; you’ll likely have to use an existing tool such as <a href="http://www.microsoft.com/downloads/details.aspx?FamilyID=890cd06b-abf8-4c25-91b2-f8d975cf8c07">Microsoft HTTP Log Parser</a> or write your own. It isn’t difficult; it will simply take some time to implement. Useful references for this are a list of all the robot <a href="http://www.robotstxt.org/db.html">user agents</a>, and the more complete lists of bots from <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40364">Google</a> and <a href="http://blogs.msdn.com/livesearch/archive/2006/11/29/search-robots-in-disguise.aspx">Microsoft</a>.</p>
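<p>If you do end up writing your own, a first pass can be quite small. The Python sketch below counts which URLs a given robot requested. It assumes your server writes the common “combined” log format (adjust the regular expression if yours differs), and the function name <code>robot_hits</code> is made up for this example.</p>

```python
import re
from collections import Counter

# Captures the request path and the user-agent from a combined-format log line.
LOG_LINE = re.compile(r'"[A-Z]+ (\S+) [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"')


def robot_hits(log_lines, robot_token):
    """Count requests per path made by user-agents containing `robot_token`."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and robot_token.lower() in match.group(2).lower():
            hits[match.group(1)] += 1
    return hits
```

<p>Running <code>robot_hits(open("access.log"), "googlebot")</code> (with your own log file name) and comparing the paths against your robots.txt rules shows whether a robot is requesting content you meant to block. Keep in mind that the user-agent string can be faked, which is what the next section is about.</p>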
<p><a name="verify"></a><em><strong>Verifying Robot Identity</strong></em></p>
<p>Another thing you’ll likely want to do in this endeavor is validate that the robots are who they say they are. Google, Yahoo and Microsoft all support <a href="http://en.wikipedia.org/wiki/Reverse_DNS_lookup">Reverse DNS authentication</a> of their robots. The process is pretty simple and is described by <a href="http://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html">Google</a>, <a href="http://www.ysearchblog.com/archives/000460.html">Yahoo</a> and <a href="http://blogs.msdn.com/livesearch/archive/2006/11/29/search-robots-in-disguise.aspx">Microsoft</a>. Essentially, you find out which DNS domain each engine’s robots are hosted in, and check for that domain in your tool. This way, if a robot’s IP address changes (which it will), you don’t need to update your code.</p>
<p>Should you find any issues, where one of the robots is not minding the REP or is misbehaving in some other way, you can always communicate directly with each engine through one of their forums:</p>
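<p>As a rough sketch of that check in Python: look up the hostname for the requesting IP address, confirm it ends in a domain the engine documents for its robots, then resolve that hostname forward again and confirm it maps back to the same IP address. The function name is made up for this example, and the list of allowed domain suffixes is something you must take from each engine’s own documentation. The lookup functions are parameters only so the logic can be exercised without touching the network.</p>

```python
import socket


def verify_robot(ip_address, allowed_suffixes,
                 reverse_lookup=socket.gethostbyaddr,
                 forward_lookup=socket.gethostbyname_ex):
    """Return True if `ip_address` reverse-resolves into an allowed domain
    and that hostname resolves forward to the same address."""
    try:
        hostname = reverse_lookup(ip_address)[0]
    except OSError:
        return False  # no reverse DNS record for this address
    if not hostname.lower().endswith(tuple(allowed_suffixes)):
        return False  # hostname is outside the engine's documented domains
    try:
        forward_addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    # The forward lookup guards against a faked reverse DNS record.
    return ip_address in forward_addresses
```

<p>Include the leading dot in each suffix (for example <code>".bot.example.com"</code>, a placeholder) so that a lookalike domain ending in the same letters does not pass.</p>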
<ul>
<li><a href="http://groups.google.com/group/Google_Webmaster_Help-Indexing/topics">Google Crawling, Indexing and Ranking Forum</a></li>
<li><a href="http://help.yahoo.com/l/us/yahoo/search/search_support.html">Yahoo Crawler Feedback Form</a></li>
<li><a href="http://forums.microsoft.com/webmaster/ShowForum.aspx?ForumID=1984&amp;SiteID=79">Microsoft Crawler Error and Feedback Forum</a></li>
</ul>
<h2><a name="removal"></a>Removing Content From Search Engine Indices</h2>
<p>If you find that you haven’t implemented the techniques described here correctly and private content from your site is indexed, each of the major search engines has methods available for requesting that it be removed. For more information, see:</p>
<ul>
<li><a href="http://googlewebmastercentral.blogspot.com/2007/04/requesting-removal-of-content-from-our.html">Google: Requesting removal of content from our index</a></li>
<li><a href="http://help.yahoo.com/l/us/yahoo/search/siteexplorer/delete/">Yahoo!: Deleting URLs</a></li>
<li><a href="https://support.live.com/eform.aspx?productKey=wlsearch&amp;page=wlsupport_home_options_form_byemail&amp;ct=eformts">Live Search: Requesting content removal</a></li>
</ul>
<h2><a name="Additional_Resources:_"></a>Additional Resources:</h2>
<ul>
<li>Google
<ul>
<li><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40362">How to create a robots.txt file</a></li>
<li><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40364">Descriptions of each user-agent that Google uses</a></li>
<li><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40367">How to use pattern matching</a></li>
<li><a href="http://www.google.com/support/webmasters/bin/answer.py?answer=40368">How often we recrawl your robots.txt file</a></li>
<li><a href="http://googlewebmastercentral.blogspot.com/2006/08/all-about-googlebot.html">All about Googlebot</a></li>
</ul>
</li>
<li>Yahoo!
<ul>
<li><a href="http://www.ysearchblog.com/archives/000372.html">Wild card support</a></li>
<li><a href="http://www.ysearchblog.com/archives/000508.html">X-Robots tag directive support</a></li>
</ul>
</li>
<li>Microsoft Live Search
<ul>
<li><a href="http://blogs.msdn.com/livesearch/archive/2006/11/29/search-robots-in-disguise.aspx">Search robots in disguise</a></li>
</ul>
</li>
<li>Other resources
<ul>
<li><a href="http://searchengineland.com/070305-204850.php">Search Engine Land: Meta Robots Tag 101</a></li>
<li><a href="http://searchengineland.com/080603-121100.php">Search Engine Land: Yahoo!, Microsoft, Google Clarify Robots.txt Support</a></li>
<li><a href="http://searchengineland.com/070417-213813.php">Search Engine Land: URL Removal Options</a></li>
<li><a href="http://www.robotstxt.org/">robotstxt.org</a></li>
<li><a href="http://en.wikipedia.org/wiki/Robots.txt">Wikipedia: Robots Exclusion Standard</a></li>
</ul>
</li>
</ul>
<h3>Revision History</h3>
<ul>
<li>02/12/2009 – Google, Yahoo and Microsoft make a joint announcement of the rel=’Canonical’ tag to make it easier for publishers to specify the canonical URLs.</li>
<li>06/04/2009 – Added NOPREVIEW tag announced this week by Microsoft. Used to disable the ‘hover preview’ feature on their SERP.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://nathanbuggia.com/posts/managing-search-engine-access-to-your-site/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Where&#8217;s Nathan?</title>
		<link>http://nathanbuggia.com/posts/wheres-nathan/</link>
		<comments>http://nathanbuggia.com/posts/wheres-nathan/#comments</comments>
		<pubDate>Mon, 08 Dec 2008 16:22:46 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">/blog/post/Wheres-Nathan.aspx</guid>
		<description><![CDATA[<p>It has been quite a while since I last updated this blog. That doesn&#8217;t mean I have stopped writing, just that I&#8217;ve been writing in other places. For SEO, I will mostly continue to post articles on the <a href="http://blogs.msdn.com/webmaster">Live Search Webmaster blog</a>; for developer-related content (and <a href="http://en.oreilly.com/found">O&#8217;Reilly Found</a> conference information) I will continue [...]]]></description>
			<content:encoded><![CDATA[<p>It has been quite a while since I last updated this blog. That doesn&#8217;t mean I have stopped writing, just that I&#8217;ve been writing in other places. For SEO, I will mostly continue to post articles on the <a href="http://blogs.msdn.com/webmaster">Live Search Webmaster blog</a>; for developer-related content (and <a href="http://en.oreilly.com/found">O&#8217;Reilly Found</a> conference information) I will continue to post on <a href="http://janeandrobot.com/">Jane and Robot</a>; and for financial insight I&#8217;ll be writing on the <a href="http://meridianpacific.org">Meridian Pacific Investments</a> blog.</p>
<p>I&#8217;ve been really busy lately with Microsoft and the O&#8217;Reilly conference, but I still have a few additional side projects that I&#8217;ll be introducing in the next couple of months. Stay tuned. Here are a few of the articles I&#8217;ve published recently:</p>
<ul>
<li><a href="http://janeandrobot.com/post/URL-Referrer-Tracking.aspx">URL Referrer Tracking</a> &#8211; overview of the various implementations available for tracking where your traffic is coming from in a search-friendly way.</li>
<li><a href="http://blogs.msdn.com/webmaster/archive/2008/10/20/smx-east-2008-ajax-css-and-web-2-0.aspx">SMX East: AJAX, CSS and Web 2.0</a> &#8211; a strategic look at implementation options for building search-friendly Web 2.0 applications.</li>
<li><a href="http://blogs.msdn.com/webmaster/archive/2008/10/15/smx-east-2008-unraveling-urls-and-demystifying-domains.aspx">SMX East: Unraveling URLs and Demystifying Domains</a> &#8211; the top 7 issues we at Live Search see with URLs and how to avoid them.</li>
<li><a href="http://blogs.msdn.com/webmaster/archive/2008/10/13/smx-east-2008-webmaster-guidelines.aspx">SMX East: Webmaster Guidelines</a> &#8211; overview of the search engine guidelines for indexing and background information on why they exist.</li>
</ul>
<p>I will continue to update this blog as time permits, but I don&#8217;t write very much that isn&#8217;t finance- or SEO-related these days. Hopefully as I dig deeper into some of my other projects I&#8217;ll have something useful to write.</p>
]]></content:encoded>
			<wfw:commentRss>http://nathanbuggia.com/posts/wheres-nathan/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Web 2.0 Expo: Advanced SEO for Developers</title>
		<link>http://nathanbuggia.com/posts/web-2-0-expo-advanced-seo-for-developers/</link>
		<comments>http://nathanbuggia.com/posts/web-2-0-expo-advanced-seo-for-developers/#comments</comments>
		<pubDate>Thu, 18 Sep 2008 17:19:00 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[Search & SEO]]></category>

		<guid isPermaLink="false">/blog/post/Web-20-Expo-Advanced-SEO-for-Developers.aspx</guid>
		<description><![CDATA[<p> I would like to thank everyone for taking the time to check out my session at Web 2.0 Expo here in New York City. I&#39;ve posted my presentation and answers to all the online questions on the <a href="http://blogs.msdn.com/webmaster/archive/2008/09/24/web-2-0-expo-seo-for-web-development-presentation.aspx#comments">Webmaster Center</a> Blog at Live Search. </p> Q&#38;A &#8211; <a href="http://blogs.msdn.com/webmaster/archive/2008/09/24/web-2-0-expo-seo-for-web-development-presentation.aspx">http://blogs.msdn.com/webmaster/archive/2008/09/24/web-2-0-expo-seo-for-web-development-presentation.aspx</a> Presentation &#8211; <a href="http://nathanbuggia.com/wp-content/files/Web_20_NYC_2008.pptx">http://nathanbuggia.com/wp-content/files/Web_20_NYC_2008.pptx</a>]]></description>
			<content:encoded><![CDATA[<p>
I would like to thank everyone for taking the time to check out my session at Web 2.0 Expo here in New York City. I&#39;ve posted my presentation and answers to all the online questions on the <a href="http://blogs.msdn.com/webmaster/archive/2008/09/24/web-2-0-expo-seo-for-web-development-presentation.aspx#comments">Webmaster Center</a> Blog at Live Search.
</p>
<ul>
<li>Q&amp;A &#8211; <a href="http://blogs.msdn.com/webmaster/archive/2008/09/24/web-2-0-expo-seo-for-web-development-presentation.aspx">http://blogs.msdn.com/webmaster/archive/2008/09/24/web-2-0-expo-seo-for-web-development-presentation.aspx</a></li>
<li>Presentation &#8211; <a href="http://nathanbuggia.com/wp-content/files/Web_20_NYC_2008.pptx">http://nathanbuggia.com/wp-content/files/Web_20_NYC_2008.pptx</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://nathanbuggia.com/posts/web-2-0-expo-advanced-seo-for-developers/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Import a CSV File into SQL Server</title>
		<link>http://nathanbuggia.com/posts/import-a-csv-file-into-sql-server/</link>
		<comments>http://nathanbuggia.com/posts/import-a-csv-file-into-sql-server/#comments</comments>
		<pubDate>Thu, 04 Sep 2008 08:49:00 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[Code]]></category>

		<guid isPermaLink="false">/blog/post/Import-a-CSV-File-into-SQL-Server-BULK-INSERT.aspx</guid>
		<description><![CDATA[<p>Metrics aggregation and reporting have always seemed to be a part of every job I&#8217;ve had. Over the years I&#8217;ve developed a system that allows me to slice and dice just about anything using Excel, SQL and a little bit of code. I used to rely heavily on the Data Transformation Services in SQL 2000 [...]]]></description>
			<content:encoded><![CDATA[<p>Metrics aggregation and reporting have always seemed to be a part of every job I&#8217;ve had. Over the years I&#8217;ve developed a system that allows me to slice and dice just about anything using Excel, SQL and a little bit of code. I used to rely heavily on the Data Transformation Services in SQL Server 2000 Enterprise Manager, and hadn&#8217;t really found a good replacement (read: free replacement) until today. I just came across a little snippet of SQL that does the trick very well. Here&#8217;s what you do:</p>
<h2>1. Create a new table in your database</h2>
<p>Create a new table in your database, making sure each column data type is compatible with the corresponding column in your CSV file.</p>
<p><a href="http://seatteboulders.com/wp-content/uploads/2008/09/import-csv-file-1.png"><img class="alignnone size-full wp-image-119" title="import-csv-file-1" src="http://seatteboulders.com/wp-content/uploads/2008/09/import-csv-file-1.png" alt="" width="350" height="170" /></a></p>
<h2>2. Properly format your input CSV file</h2>
<p>Whatever data you want to import should be in standard <a href="http://en.wikipedia.org/wiki/Comma-separated_values" target="_blank">CSV file format</a>, as in the example below. Save the file in an easy-to-find location like <strong>c:\</strong>.</p>
<p style="text-align: left;"><a href="http://seatteboulders.com/wp-content/uploads/2008/09/import-csv-file-2.png"><img class="alignnone size-full wp-image-120" title="import-csv-file-2" src="http://seatteboulders.com/wp-content/uploads/2008/09/import-csv-file-2.png" alt="" width="436" height="217" /></a></p>
<h2>3. Run this script</h2>
<p>Finally, execute the following script on your SQL Server. It should locate the CSV file and import all the rows. Note that if it encounters an error on a row, it simply leaves that row out of the resulting table. That could be a bit of a problem if you&#8217;ve got a lot of data, because missing rows are easy to overlook.</p>
<p><a href="http://seatteboulders.com/wp-content/uploads/2008/09/import-csv-file-3.png"><img class="alignnone size-full wp-image-122" title="import-csv-file-3" src="http://seatteboulders.com/wp-content/uploads/2008/09/import-csv-file-3.png" alt="" width="335" height="197" /></a></p>
<p>This script seems to work in SQL Server 2005 and 2008. For more information, check out MSDN&#8217;s reference material; there are a lot more bells and whistles than I&#8217;m using in this simple example. <a title="http://msdn.microsoft.com/en-us/library/ms188365.aspx" href="http://msdn.microsoft.com/en-us/library/ms188365.aspx">http://msdn.microsoft.com/en-us/library/ms188365.aspx</a></p>
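<p>For readers who cannot see the screenshot in step 3, here is a minimal sketch of what a <code>BULK INSERT</code> statement for this kind of import typically looks like. It is not a transcription of the screenshot: the table name <code>dbo.CsvImport</code>, the file name <code>c:\csv-import.csv</code>, and the assumption that the first line of the file is a header row are all placeholders to adjust for your own data.</p>
<pre><code>-- Hypothetical table and file names; replace them with your own.
BULK INSERT dbo.CsvImport
FROM 'c:\csv-import.csv'
WITH (
    FIELDTERMINATOR = ',',  -- columns are separated by commas
    ROWTERMINATOR = '\n',   -- one record per line
    FIRSTROW = 2            -- skip a header row; drop this if the file has none
);</code></pre>
<p>Two details are worth knowing. The file path is resolved on the machine running SQL Server, not on your client. The <code>MAXERRORS</code> option (default 10) sets how many bad rows are skipped before the statement fails, so setting <code>MAXERRORS = 0</code> should make the import stop at the first bad row rather than quietly leaving it out.</p>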
]]></content:encoded>
			<wfw:commentRss>http://nathanbuggia.com/posts/import-a-csv-file-into-sql-server/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>TechEd Developer 2008: Advanced SEO for Web Development</title>
		<link>http://nathanbuggia.com/posts/teched-developer-2008-advanced-seo-for-web-development/</link>
		<comments>http://nathanbuggia.com/posts/teched-developer-2008-advanced-seo-for-web-development/#comments</comments>
		<pubDate>Fri, 06 Jun 2008 20:05:00 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[Search & SEO]]></category>

		<guid isPermaLink="false">/blog/post/TechEd-Developer-2008-Advanced-SEO-for-Web-Development.aspx</guid>
		<description><![CDATA[<p>I would like to thank everyone for attending my session at this year&#8217;s <a href="http://www.microsoft.com/events/teched2008/developer/default.mspx" target="_blank">TechEd Developer</a> conference in Orlando, FL. Below I&#8217;ve included a link to my slides, please feel free to contact me if you have any questions. This is very similar to the <a href="http://nathanbuggia.com/post/Mix08-Presentation-Advanced-SEO-for-Developers.aspx" target="_blank">Mix08 Advanced SEO Presentation</a> I gave a [...]]]></description>
			<content:encoded><![CDATA[<p>I would like to thank everyone for attending my session at this year&#8217;s <a href="http://www.microsoft.com/events/teched2008/developer/default.mspx" target="_blank">TechEd Developer</a> conference in Orlando, FL. Below I&#8217;ve included a link to my slides; please feel free to contact me if you have any questions. This is very similar to the <a href="http://nathanbuggia.com/post/Mix08-Presentation-Advanced-SEO-for-Developers.aspx" target="_blank">Mix08 Advanced SEO Presentation</a> I gave a few months ago, except that I expanded the &#8220;Diagnosing SEO issues within your site&#8221; section. For a deeper dive on the subject, check out the 3-hour <a href="http://janeandrobot.com/admin/Pages/web20presentations.html" target="_blank">SEO for Web Development</a> workshop that <a href="http://www.ninebyblue.com/about/" target="_blank">Vanessa Fox</a> and I presented at Web 2.0.</p>
<p>I also recommend you check out the <a href="http://webmaster.live.com/" target="_blank">Live Search Webmaster Center</a> for information about how Live Search is crawling your site, as well as news and discussion forums. Have I done enough shameless plugging yet?</p>
<ul>
<li><a rel="enclosure" href="http://nathanbuggia.com/wp-content/files/Advanced-SEO-Web-Development-TechEd-2008.pptx">Advanced-SEO-Web-Development-TechEd-2008.pptx (5.31 mb)</a></li>
</ul>
<div id="__ss_451317" style="width: 425px; text-align: left;">
<div style="margin: 0px;"><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="355" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowfullscreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://static.slideshare.net/swf/ssplayer2.swf?doc=advancedseowebdevelopmentteched2008-1212751043637245-8" /><embed type="application/x-shockwave-flash" width="425" height="355" src="http://static.slideshare.net/swf/ssplayer2.swf?doc=advancedseowebdevelopmentteched2008-1212751043637245-8" allowscriptaccess="always" allowfullscreen="true"></embed></object></div>
<div style="font-size: 11px; font-family: tahoma,arial; height: 26px; padding-top: 2px;"><a href="http://www.slideshare.net/?src=embed"><img style="border: 0px none; margin-bottom: -5px;" src="http://static.slideshare.net/swf/logo_embd.png" alt="SlideShare" /></a> | <a title="View Advanced Seo Web Development Tech Ed 2008 on SlideShare" href="http://www.slideshare.net/nbuggia/advanced-seo-web-development-tech-ed-2008?src=embed">View</a> | <a href="http://www.slideshare.net/upload?src=embed">Upload your own</a></div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://nathanbuggia.com/posts/teched-developer-2008-advanced-seo-for-web-development/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Seattle (free) Networking Events for Developers</title>
		<link>http://nathanbuggia.com/posts/seattle-free-networking-events-for-developers/</link>
		<comments>http://nathanbuggia.com/posts/seattle-free-networking-events-for-developers/#comments</comments>
		<pubDate>Sat, 10 May 2008 05:22:00 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[Search & SEO]]></category>

		<guid isPermaLink="false">/blog/post/Seattle-(free)-Networking-Events-for-Developers.aspx</guid>
		<description><![CDATA[<p> Are you in the Seattle area and interested in the technical side of search engine optimization, e.g., implementation and operational best practices, design patterns, and site reviews? Then you should come check out one of the upcoming events being hosted by Jane &#38; Robot in May. Each event will have a couple of 15-minute [...]]]></description>
			<content:encoded><![CDATA[<p>
Are you in the Seattle area and interested in the technical side of search engine optimization, e.g., implementation and operational best practices, design patterns, and site reviews? Then you should come check out one of the upcoming events being hosted by Jane &amp; Robot in May. Each event will have a couple of 15-minute talks by local experts, time for Q&amp;A, and then a couple of in-depth site reviews (not to mention free beer and snacks!).
</p>
<h2>Tuesday, May 13th @ 6pm</h2>
<p>
<a href="http://www.solo-bar.com/">Solo Bar</a>, 200 Roy Street, Seattle<br />
Sponsored by Microsoft, so they&rsquo;ll be providing lots of swag in addition to food and drinks. <a href="http://www.ninebyblue.com/about/">Vanessa Fox</a> will be talking about how a search engine works, and someone from Microsoft will talk about ASP.NET and Silverlight best practices. Then we&#39;ll do the Q&amp;A and site reviews, followed by general networking time.
</p>
<p>
Sign up at Upcoming.org: <a href="http://upcoming.yahoo.com/event/594441">http://upcoming.yahoo.com/event/594441</a>
</p>
<h2>Thursday, May 29th @ 6pm</h2>
<p>
<a href="http://maps.live.com/default.aspx?v=2&amp;FORM=LMLTCP&amp;cp=ry79sx4t3qzm&amp;style=b&amp;lvl=1&amp;tilt=-90&amp;dir=0&amp;alt=-1000&amp;scene=3694119&amp;phx=0&amp;phy=0&amp;phscl=1&amp;sp=Point.ry79sx4t3qzm_Google's%20Office____&amp;encType=1">Google Seattle office</a>, 651 N 34th St. Seattle<br />
Sponsored by Google, so expect a lot of primary-colored furniture. For this session we&#39;ll focus a bit more on diagnosing issues with your site, and look out for a few interesting guest speakers.
</p>
<p>
Sign up at Upcoming.org: <a href="http://upcoming.yahoo.com/event/622645">http://upcoming.yahoo.com/event/622645</a></p>
]]></content:encoded>
			<wfw:commentRss>http://nathanbuggia.com/posts/seattle-free-networking-events-for-developers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Startup Metrics 101 &#8211; Dave McClure&#8217;s Talk at Web 2.0</title>
		<link>http://nathanbuggia.com/posts/startup-metrics-101-dave-mcclures-talk-at-web-2-0/</link>
		<comments>http://nathanbuggia.com/posts/startup-metrics-101-dave-mcclures-talk-at-web-2-0/#comments</comments>
		<pubDate>Wed, 23 Apr 2008 06:24:00 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">/blog/post/Startup-Metrics-101-Dave-McClures-Talk-at-Web-20-Talk.aspx</guid>
		<description><![CDATA[<p> Better known as the &#34;AARRR!&#34; method for running a startup (Acquisition, Activation, Retention, Referral, Revenue). It turns out that this methodology is just as useful for established businesses as for startups. </p>]]></description>
			<content:encoded><![CDATA[<p>
Better known as the &quot;AARRR!&quot; method for running a startup (Acquisition, Activation, Retention, Referral, Revenue). It turns out that this methodology is just as useful for established businesses as for startups.
</p>
<div id="__ss_367863" style="width: 425px; text-align: center">
<div style="margin: 0px">
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" width="425" height="355"><param name="width" value="425" /><param name="height" value="355" /><param name="allowfullscreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://static.slideshare.net/swf/ssplayer2.swf?doc=startupmetrics101aarrrpdf-1208898881815937-9" /><embed type="application/x-shockwave-flash" width="425" height="355" allowfullscreen="true" allowscriptaccess="always" src="http://static.slideshare.net/swf/ssplayer2.swf?doc=startupmetrics101aarrrpdf-1208898881815937-9"></embed></object>
</div>
<div style="font-size: 11px; font-family: tahoma,arial; height: 26px; padding-top: 2px">
<a href="http://www.slideshare.net/?src=embed"><img style="border: 0px none ; margin-bottom: -5px" src="http://static.slideshare.net/swf/logo_embd.png" alt="SlideShare" /></a> | <a href="http://www.slideshare.net/dmc500hats/startup-metrics-101-367863" title="View this slideshow on SlideShare">View</a> | <a href="http://www.slideshare.net/upload">Upload your own</a>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://nathanbuggia.com/posts/startup-metrics-101-dave-mcclures-talk-at-web-2-0/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>An SEO Conference Just for Developers (Seattle)</title>
		<link>http://nathanbuggia.com/posts/an-seo-conference-just-for-developers-seattle/</link>
		<comments>http://nathanbuggia.com/posts/an-seo-conference-just-for-developers-seattle/#comments</comments>
		<pubDate>Tue, 08 Apr 2008 09:32:00 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[Search & SEO]]></category>

		<guid isPermaLink="false">/blog/post/Seattle-SEO-Conference-for-Developers-SMX-Advanced.aspx</guid>
		<description><![CDATA[<p> Did you know that search engines drive nearly 30% of all page views to web sites? And that&#39;s just an average; the number can be much higher for smaller sites and ecommerce sites. That&#39;s a lot of customers, possibly even more than the number of Firefox users coming to your site. That raises the [...]]]></description>
			<content:encoded><![CDATA[<p>
Did you know that search engines drive nearly 30% of all page views to web sites? And that&#39;s just an average; the number can be much higher for smaller sites and ecommerce sites. That&#39;s a lot of customers, possibly even more than the number of Firefox users coming to your site. That raises the question &#8211; do you spend as much time testing your site for search as you do for Firefox?
</p>
<p>
Over the past year I&#39;ve done several talks for web developers on SEO, covering how search engines work and some best practices, and I have been really surprised at how low the awareness has been. Everyone seems interested, but there tend to be a lot of misconceptions about SEO (e.g. <em>&quot;I want to build my site for people not robots&quot;</em>, or <em>&quot;SEO? No, I&#39;m not a spammer&quot;</em>). Well, to help put those misconceptions aside, I&#39;ve been working with <a href="http://www.ninebyblue.com/about/">Vanessa Fox</a> at the SMX conference series to create a 1-day event on <a href="http://searchmarketingexpo.com/advanced/2008/developer-day.php">SEO for web developers</a>. Here&#39;s the agenda:
</p>
<ul>
<li><strong>Search Friendly Development</strong><br />
	Highlights the most important elements to consider for search engine optimization (SEO) when building a web application infrastructure and provides tactical details about how to implement those elements. Topics include: </li>
</ul>
<blockquote>
<ul>
<li>Developing a crawlable infrastructure </li>
<li>Considerations when developing rich internet applications (using technologies such as Flash, Silverlight, and AJAX) </li>
<li>URL rewriting, redirection, canonicalization, and visitor tracking </li>
</ul>
</blockquote>
<ul>
<li><strong>Platform Considerations for the Microsoft Stack and LAMP Stack</strong><br />
	Practical tips, tricks, and workarounds for search-friendly architecture. </li>
</ul>
<blockquote>
<ul>
<li>Microsoft Stack (IIS, ASP.NET, Silverlight) </li>
<li>LAMP Stack (Apache, PHP, Ruby, Flash/Flex) </li>
<li>CMS Platforms (e.g. BlogEngine.NET, AxCMS, WordPress, Movable Type, Drupal, Joomla) </li>
</ul>
</blockquote>
<ul>
<li><strong>Diagnosing Web Site Architecture Issues</strong><br />
	Provides a checklist and workflow for diagnosing your web sites for SEO obstacles using freely available diagnostic tools.</li>
</ul>
<ul>
<li><strong>Expert Technical Review of Your Website</strong><br />
	This session will bring together experts who will use all of the information and tactics learned throughout the day and apply them to detailed site reviews of the code and infrastructure of sites submitted in advance by the audience.</li>
</ul>
<p>
Sign up here if you&#39;re interested in <a href="http://searchmarketingexpo.com/speaker-form.php">becoming a speaker</a> at this event! Otherwise, if you&#39;re located in Seattle, you can <a href="http://searchmarketingexpo.com/advanced/2008/register.php">register here</a>.
</p>
<p>
We&#39;ll also be hosting several free events leading up to this; stay tuned for more information on those as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://nathanbuggia.com/posts/an-seo-conference-just-for-developers-seattle/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

