Search Engine question
I'm just curious about whether or not pages with server-side code in them rank as well as vanilla html pages. I know that when a spider hits an index.asp it doesn't see the asp code, just the html after the includes are inserted. BUT does the search engine take into account the fact that it IS asp and rank it lower?
|
Well, Google doesn't rank them differently, but it doesn't like too many parameters attached to a url... especially not session IDs...
If you're worried, configure the server to parse .html files as .php or .asp.... |
You won't rank lower because of the page type or extension; however, you are correct in stating that search engines can only index the static html code of your pages.
|
Quote:
The page you see in the browser includes the html from the footer file, but the asp include statement itself is nowhere to be seen. I don't use any variables in the urls, so I'm thinking that page will qualify as static content. But I just wanted to make absolutely sure, hehehe. Since there is no penalty for the file extension, it seems that I should be in the clear. I could have used SSI and made it a .shtml page, but we have asp on the server so I figured I might as well use that to save having to change stuff individually on 20 pages, hehe. Thanks for your input. |
Just don't include the header content via an include; the search engines need the head tags static in order to read the title and metas. Anything else on the page has to be in the static CODE to be read by SE's. SE's cannot spider includes or the content within them yet.
|
Quote:
At any rate I'm guessing that using includes to pull the footer and the news portion of a page would probably not seriously affect a ranking even if the spider doesn't see it. |
The search engines see what code is on your page when you upload it to the server, not what you see when you View Source on the page from the web.
If your code is all includes, you have no chance of ranking that site unless you can plug in some static content, either hidden in code or visible on the actual page. Trust me, I know; I've been battling it out with a Cold Fusion site that is 100% includes and html content plugins, where the only pages I can get ranked are the static promotional pages. What you see from the web has no bearing on what the SE sees. An example is http://SmutDrs.com - every piece of code on that site is from an include. Not one bit of static content. Can you tell? Only the webmaster truly knows. SE's cannot spider or index included content yet. They are just now becoming able to spider and index flash movies, however :) Google can do links, and AllTheWeb.com can index links and actual text in the flash movie :) |
Coldfusion may be different from PHP and ASP....
In PHP, when you call a file it gets parsed by the PHP engine and only the html output is sent to the external caller... I'll just make an all-include php file myself and then I'll see... Well, Google did index that: http://216.239.53.104/search?q=cache...hl=en&ie=UTF-8 http://216.239.53.104/search?q=cache...hl=en&ie=UTF-8 |
If you want to learn more about search engine optimization, check out: http://www.searchenginewatch.com http://www.webmasterworld.com http://www.searchengineforums.com One last time, do NOT hide text. It's like begging to be banned. |
Hmmm... seems to be a bit of a disagreement on that... lol. I'm pretty much in the camp that says the spider can't see anything that the server doesn't send to it. So if it asks for a page with includes, the server is going to put the content into the page and THEN send it to the spider. Basically there are 3 ways to access a file on a webserver: HTTP, FTP and SSH/Telnet. FTP and SSH/Telnet require a username and password in most cases, and HTTP doesn't see any difference between Joe Surfer and Moe Spider. And a SE spider doesn't use FTP or other protocols because it is spidering WEB content. If a spider could see what is actually on the server, as opposed to what is served by HTTP, there would be a HUGE security hole. That would mean that someone could write a script to emulate a spider and download all the .htaccess and other files on a server.
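To make that concrete, here is a minimal sketch (the URL and User-Agent strings are just placeholders): request the same page once as a "browser" and once as a "spider" and compare what comes back. Unless the server deliberately cloaks by User-Agent (or the page output changes per request), the two responses are byte-for-byte identical.

# Sketch: a "spider" and a "browser" get the same HTML over HTTP.
# URL and User-Agent strings are placeholders, not anything official.
import urllib.request

URL = "http://www.example.com/index.asp"

def fetch(user_agent):
    req = urllib.request.Request(URL, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

browser_html = fetch("Mozilla/5.0 (Windows; U) Gecko/20030624")
spider_html = fetch("Googlebot/2.1 (+http://www.googlebot.com/bot.html)")

print(len(browser_html), len(spider_html))
# True unless the site cloaks by User-Agent or embeds per-request content
print("identical:", browser_html == spider_html)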
|
Just for the hell of it I did the following search on Google:
http://www.google.ca/search?q=eromod...-8&hl=en&meta=
There is one result, which is: http://www.prettyteenmovies.com/EromodelCash/Index.asp
The quoted text is: ... 06-01-03 Eromodel Cash We are pleased and proud to announce that we have signed June 2003 Penthouse Pet Lanny Barbie to be a part of our online family! ...
That text comes from the news scroller, and it is called via: <!--#include file="news.htm"-->
The text isn't visible anyplace else on the page. So it seems spiders certainly do NOT have any problems with server side includes. At least not those in asp. |
Differences of opinion are great. But you cannot say I'm wrong on this point:

"100% wrong. A search engine sees exactly what you see when you view the source of your page. There's no way for a spider to see the "uploaded" version of the page. A search engine's spider makes the same type of HTTP request as a browser, so it sees exactly the same thing."

You misunderstand. A search engine sees what you see when you view the source of a page while looking at it on the server, not from the web.

"I don't think your problem is includes."

View the source of http://SmutDrs.com - every piece of code on that site is from an include; there is 0 static content on the page, including the head and foot tags. I know it is includes and cfm plugins, I have seen the source. In order to optimize the head tags they had to optimize the content in the included file, so although you can see it from the web, when you look at the page on the server all you see is the include tags, and NOT the actual head code content. The reason they are the same on every page is because IT'S PULLING FROM THE SAME INCLUDE HEADER FILE! If it seems stuffed, it's because they tried to manually integrate the metas on top of the includes. The only pages they can get this site to rank are doorways and non-dynamic pages. Cold Fusion has a lot more problems than PHP and ASP, for sure!

I did not suggest abusing hidden code, but rather integrating indexable content relevant to your site into your indexable code as best you can. Ideally you do not want to use includes to manage and supply textual content. My testing over the past 3 months has shown that SE's are still unable to spider includes as content for the source page they are indexing for ranking. Includes can be useful when used for graphical content; this lightens the source code so your actual text content stands out, which can help your rankings. However, with SmutDrs.com there is NO static text or HTML on the site whatsoever, no matter how it appears when viewed from the web. Many sites use both; just optimize the html static content around your includes and you should do great!

Appreciate your input NetRodent, however I have been doing full-time SEO for the past 5 years and have worked with all kinds of sites, designs, and programming languages. I would not mislead someone or suggest something that I did not learn from experience. A difference of opinion is always welcome :D

Best wishes, Cyndalie |
On the server you can even have Lynx call a php or asp or cfm page, dump it to a temp file, and then view the temp file. All you see in there is static html and no includes.... As soon as there is an http request and the http server is configured to parse files, they will be parsed.... As pointed out by MisterX before, if you don't do an http request (e.g. ftp or ssh), then you will get the original file with all includes... even wget will result in a "static" html page. |
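Roughly the same check in a few lines of script (the URL and file path here are made up for the example): fetch the page over HTTP and compare it with the raw file read straight off the disk. The served copy has the includes already expanded; the file on disk still shows the include directives.

# Sketch: HTTP response vs. raw file on disk. URL and path are hypothetical.
import urllib.request

# What a spider (or wget / "lynx -source") gets: parsed output, no include tags.
with urllib.request.urlopen("http://www.example.com/index.asp") as resp:
    served = resp.read().decode("latin-1", errors="replace")

# What you get by reading the file itself (FTP/SSH access): the unparsed source.
with open("/web/docroot/index.asp", encoding="latin-1") as f:
    source = f.read()

print("include directive in served copy:", "#include" in served)   # normally False
print("include directive in source file:", "#include" in source)   # True if the page uses includes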
I know that you can use browser includes. That's done quite often with Javascript and in that case it's the browser that puts things together and not the server. Certainly in that case a spider wouldn't be getting all the code because it isn't set up the same as a browser. But there are pretty big differences between browser and server includes.
|
"How do you look on the server at the file? " First you download the file to your drive (via FTP) and view it in a notepad.
Ok this is kinda hard to explain.. Have you ever viewed a .cfm or .whatever file in a text pad BEFORE it is uploaded to the server or viewed from the web? Cold fusion is the worst with this.. In a text pad, not an HTML editor. I can't type exact code or it can mess up the board but the actual page source of smutdrs look like this: cfinclude template = " app_globals.cfm " from head to foot. They had to manually put in metas at the top of the page, even though one of the template includes plugs it in- there even are no open and close html tags even in the actual code of the page. Because it was in a include and engines read the HTML code, if they did not do it manually the engines had nothing to read or index the site on. BUT when you view it from the web, all content is plugged in so what you see is NOT what you necessarily have to work with when optimizing dymanic sites. Since engines are robots and not humans calling a page, they can read what the programming code looks like - this used to be applied to cloaking several years ago. Show the engine one thing and the user another - but by IP, here it's because dymanic content reacts when it's CALLED, not necessarily read. This is why when the code is cached and viewed in a brower, it even looks normal as well. Most sites consist of both HTML and whatever dynamic programming language, and they can actually work FOR you, however if you have a say when working with a programmer, make sure they are not making 100% of your site dynamically generated it it will become optimization hell. I haven't run across this much, and look for it before committing to new site for optimization. WSJB78 you had some great points I'm going to delve further into. I know I'm not 100% right, but this is what I have seen and worked with and it can be frustrating, since every language and site is different. |
MisterX, you're exactly right - great application for that purpose! Thanks for the input :)
"What I don't really understand is why they felt it was a good idea to use includes to pull the actual head section. Some kind of bizarre logic must be at work there." I wondered that myself. They said something like they built it so they can update the site via an interface rather than hire a webmaster. Paying the price now, I'm afraid.... |
Small Recap:
To better understand includes, I often think of them as iframes. I think of them as pulling content OFF your page rather than plugging it in LOL. A page does not have to have a dynamic URL to have dynamic code. Includes used for graphical purposes such as headers and menus can be beneficial, since they make the actual text on the page more visible to the search engine, reducing the lines of code it has to index and improving your chances of a full spidering. It keeps your indexable code CLEAN. However, if you INCLUDE your navigation, make sure you use footer text links or have a sitemap link on every page that links all pages together. |
I brought up your title and meta tags not to discuss how they were included in the final html document, but to suggest that possibly you weren't ranking because you repeat the same words over and over and it looks spammy. There is no technical or moral reason for a search engine not to rank a page made with Cold Fusion. However, there is a very good reason not to rank a page that has spammy headers and little body text.
Search engine spiders see exactly the same thing a browser sees. They issue the exact same http requests. Search engines do not have a "magic" way of seeing anything other than what the web server sends to them. Look through your log files and compare the byte count of a page served to a spider with the byte count of a page served to a regular user. Unless your includes spit out different results depending on the time of day, the ip address or user agent of the client, or some other random element, the byte counts will be exactly the same.
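Here's a throwaway sketch of that log-file check, assuming an Apache-style combined log format; the log path and page URL are placeholders for whatever your setup actually uses:

# Sketch: compare bytes served to Googlebot vs. everyone else for one URL,
# using a combined-format access log. Log path and URL are hypothetical.
import re
from collections import defaultdict

LOG = "/var/log/apache/access.log"
PAGE = "/index.asp"
line_re = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*" \d{3} (\d+|-) "[^"]*" "([^"]*)"')

sizes = defaultdict(set)
with open(LOG) as f:
    for line in f:
        m = line_re.search(line)
        if not m or m.group(1) != PAGE or m.group(2) == "-":
            continue
        agent = "spider" if "Googlebot" in m.group(3) else "other"
        sizes[agent].add(int(m.group(2)))

print(sizes)  # the byte counts should match unless the page output varies per request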
|
If you want to try it for yourself, open a plain connection to your webserver on port 80 and issue the HTTP commands by hand. Just make sure you hit enter twice after the last line:

GET / HTTP/1.1
Host: www.smutdrs.com
User-Agent: Googlebot/2.1 (+http://www.googlebot.com/bot.html)

You'll see exactly the same thing a spider will see. If you want to play around more with direct connections, read up on the HTTP specification: http://www.faqs.org/rfcs/rfc2616.html
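If you'd rather script that than type it into a telnet session, a raw-socket version looks something like this (only the hostname comes from the example above; the rest is just a sketch, and a Connection: close header is added so the script knows when the response ends):

# Sketch: issue the same hand-written HTTP/1.1 request over a raw socket.
import socket

HOST = "www.smutdrs.com"  # host taken from the example above
request = (
    "GET / HTTP/1.1\r\n"
    f"Host: {HOST}\r\n"
    "User-Agent: Googlebot/2.1 (+http://www.googlebot.com/bot.html)\r\n"
    "Connection: close\r\n"
    "\r\n"  # the blank line ("hit enter twice") ends the request
)

with socket.create_connection((HOST, 80)) as s:
    s.sendall(request.encode("ascii"))
    chunks = []
    while True:
        data = s.recv(4096)
        if not data:
            break
        chunks.append(data)

print(b"".join(chunks).decode("latin-1", errors="replace"))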
|
NetRodent:
I completely agree with you... dynamic pages are not worse for SEs as long as you don't have too many parameters on them. A "newer" approach is to turn those parameters into a "directory"-style path with URL rewriting (e.g. Apache's mod_rewrite). However, that does slow down the overall performance...
One alternative is to map a whole path prefix onto a handler script with ScriptAlias:

ScriptAlias /dynamic/ /web/dynamic/handler.pl

Then a request such as:

http://www.domain.com/dynamic/param1/param2/param3/param4.html

would call the script /web/dynamic/handler.pl with the PATH_INFO environment variable set to /param1/param2/param3/param4.html |
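handler.pl itself isn't shown above; purely to illustrate the idea, here is a hypothetical stand-in written in Python rather than Perl. A CGI handler reached through that ScriptAlias just splits PATH_INFO back into the parameters the friendly URL encodes:

#!/usr/bin/env python3
# Hypothetical stand-in for handler.pl: turn PATH_INFO back into parameters.
# With the ScriptAlias above, /dynamic/param1/param2/param3/param4.html
# reaches this script with PATH_INFO="/param1/param2/param3/param4.html".
import os

path_info = os.environ.get("PATH_INFO", "/")
parts = [p for p in path_info.strip("/").split("/") if p]
if parts and parts[-1].endswith(".html"):
    parts[-1] = parts[-1][:-5]

print("Content-Type: text/html")
print()
print("<html><body><ul>")
for i, value in enumerate(parts, start=1):
    print(f"<li>param{i} = {value}</li>")
print("</ul></body></html>")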