Script Caching with PHP
Author: Nir Yariv  Source: unknown  Updated: 2006-12-19

Intended Audience
Introduction
The Caching Imperative
The Script Caching Solution
The Caching Script
Implementation: Avoiding Common Pitfalls
Summary
The Script
About the Author

Intended Audience
This article is intended for the PHP programmer interested in creating a static HTML cache of dynamic PHP scripts. The article has been written specifically for an Apache server running PHP scripts, but the ideas described here are applicable to almost any Web environment.

The article assumes that you have some experience with creating dynamic Web sites and that you are familiar with HTTP – at least enough to know what a "404 Page Not Found" error means and the definition of the environment variables $REQUEST_URI and $DOCUMENT_ROOT.

Introduction
The benefits to using dynamic Web pages are well known, but there are nonetheless two significant drawbacks: speed and search engine accessibility.

Speed: The speed at which a user receives a page after clicking a link or entering a URL is a crucial factor for a Web site. It depends on dozens of variables, some of which you may have control over and some of which you don’t. There are countless bottlenecks in the process, and it’s probably impossible to fix them all. The bottleneck we will tackle here is the one caused by waiting for the server-side scripts to create the HTML output.


Search Engine Accessibility: By this I mean the ability of search engines to point to a particular Web page. Most search engines function by using a "crawler" program. Crawler programs begin on a certain page and navigate through the links on it. Every page a crawler visits is then indexed in the search engine’s database.
Most crawlers, however, are only programmed to navigate through static (HTML) pages – not dynamic ones. So, for example, pages with URLs that contain a "?" character (indicating a query string) or a filename ending with ".php" will not be accessed. Consequently, crawlers will not index these pages, making your site less accessible to new visitors.


Note: A crawler cannot tell the difference between an HTML file’s output and a PHP file’s. They both send the same content type. Therefore, most crawlers simply decide according to the filename and/or if there is a query string in the URL – that is, if the URL contains a "?".

This article discusses a procedure for dealing with both of these drawbacks. The article’s script should be sufficient under most circumstances, and it is particularly suited to small-scale Web sites and to individual script pages whose content changes only moderately.

The Caching Imperative
Simply speaking, caching entails storing the output of one or more dynamic scripts into static HTML files. A visitor to your site would be directed to these HTML files rather than to their original dynamic versions.

The mechanism for doing so can be described using a magazine’s Web site as an example.

A magazine’s Web site would likely have a database containing numerous articles and stories. You would normally have a script (say "show_article.php") that:

1. Receives an article ID number
2. Reads the article’s content from the database
3. Puts it into some kind of HTML template
4. Formats the whole page with navigation links etc.
5. Sends the resulting HTML to the visitor’s browser

As such, in the site’s homepage you might have links to current articles coded as follows:

<a href="show_article.php?id=123">Cache Article</a>
Now, articles tend to be static, and you would hope that the site operates under heavy request loads (because it’s popular!). Consequently, every request for an article would undergo the same extensive processing: connecting to the database, looking up the article, and rendering it.

Moreover, when you depend on other database information such as layout specifications, then the process would take even longer. Lastly, a search engine’s crawler would not even index the content of your article(s) because the link to the article page contains a "?" and a ".php" extension, and thus the crawler would not follow it.
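To make that per-request cost concrete, the show_article.php steps listed earlier might look roughly like the sketch below. It is only an illustration: the $articles array is a hypothetical stand-in for the database, and render_article() is a name invented here, not part of the original article.

```php
<?php
// Hypothetical stand-in for the articles table; a real site would query its database here.
$articles = [
    123 => ['title' => 'Cache Article', 'body' => 'Caching stores script output as static HTML.'],
];

/**
 * Steps 2-4: look up the article, place it in an HTML template, add navigation.
 * Returns null when the ID is unknown.
 */
function render_article(array $articles, int $id): ?string
{
    if (!isset($articles[$id])) {
        return null;
    }
    // Escape the stored text before placing it in the template.
    $title = htmlspecialchars($articles[$id]['title']);
    $body  = htmlspecialchars($articles[$id]['body']);

    return "<html><head><title>$title</title></head><body>\n"
         . "<p><a href=\"/\">Home</a></p>\n"
         . "<h1>$title</h1>\n<p>$body</p>\n"
         . "</body></html>\n";
}

// Step 1: receive the article ID from the query string (e.g. ?id=123).
$id = isset($_GET['id']) ? (int) $_GET['id'] : 0;

// Step 5: send the resulting HTML (or an error) to the visitor's browser.
$page = render_article($articles, $id);
if ($page === null) {
    header('HTTP/1.0 404 Not Found');
    echo 'Article not found';
} else {
    echo $page;
}
```

Every request repeats all five steps, even though the output for a given ID rarely changes. That repeated work is what the cache removes.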

Therefore, to alleviate these problems a Webmaster should at least consider implementing some form of caching system.

When You Should Cache a Script
While the caching solution presented in this article will be beneficial to many users, there will be circumstances when you will prefer not to cache your scripts at all or use a different caching method.

Scripts that deal with frequently changing data, such as stock values or discussion forums, or that process forms, are not fit for the system described in this article. In these cases, the decision is up to you: you might leave them dynamic, or you might opt for a more advanced solution such as the Zend Cache.

Note: Using the Zend Cache for your site caching needs would render the system described in this article totally unnecessary (though you might still want to read it in order to improve your PHP skills!). The Zend Cache provides you with a complete turnkey caching solution. For a complex site I would advise buying it (and I’m not just saying this because this is Zend’s site, but because the application is both easier to maintain and well supported).

On the other hand, if your site only features a few basic scripts, then you probably do not need to bother with caching at all.

Nonetheless, if you:

- Feature (at least relatively) complex scripts on your site,
- Wish to be able to handle numerous page hits, and/or
- Cannot afford the cost of a commercial caching solution,

then I hope this caching mechanism will serve you well.

For pages that do not need to be kept up to the minute, the speed of this system cannot be beaten since it creates pure static HTML pages.

The Script Caching Solution
The standard caching system solution is to generate static HTML files. From the earlier example, then, the link to the cache article will now be coded as follows:

<a href="/cache/show_article/id_123.html">Cache Article</a>
id_123.html contains the output generated by the show_article.php script when it is called using id=123.

It is a good practice to store all of the cached files under a single directory of their own (in the above example, it was the "/cache" directory) with sub-directories named for each creating dynamic script (i.e. "show_article/" directory).

In this manner, the cached files are separated from the dynamic scripts, making site maintenance that much easier to manage – for example, you can easily perform actions such as deleting old cached files generated by a certain script. More importantly, however, it simplifies cache.php’s string replacement mechanism. For more details, refer to cache.php details.

Be aware that links to your dynamic pages will need to be switched to point to their respective static HTML output files.

So, if you would want article #123 to be cached, for example, you would simply change the link from "show_article.php?id=123" to "cache/show_article/id_123.html".
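Rather than editing each link by hand, the naming rule could be wrapped in a small helper. The following is a minimal sketch, assuming the single-parameter pattern used in the example above. cached_link() is a hypothetical name, and it uses the leading "/" form of the path so the link works from any directory.

```php
<?php
/**
 * Build the static cache URL for a dynamic script called with one parameter.
 * Example: show_article.php?id=123 -> /cache/show_article/id_123.html
 * (Hypothetical helper; scripts with several parameters would need their own naming rule.)
 */
function cached_link(string $script, string $param, $value): string
{
    // "show_article.php" -> "show_article", used as the sub-directory name.
    $name = basename($script, '.php');

    return '/cache/' . $name . '/'
         . rawurlencode($param) . '_' . rawurlencode((string) $value) . '.html';
}

echo '<a href="' . htmlspecialchars(cached_link('show_article.php', 'id', 123)) . '">Cache Article</a>', "\n";
// prints: <a href="/cache/show_article/id_123.html">Cache Article</a>
```

Keeping the rule in one function also makes it easy to stop caching a given script later: only the helper has to change, not every page that links to it.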

Note: The HTML files do not have to be defined before assigning these new links. A script is not cached until it has been called by the Server.

Furthermore, since the HTML files will reside under a different URL, any relative paths within those files (e.g. "images/art.gif") will need to be corrected. Therefore, consider working with absolute paths such as "http://www.myserver.com/path/to/images/art.gif" or "/path/to/images/art.gif" (note the leading "/", which makes the path relative to the root of the current server).

Alternatively, you can add a <BASE HREF="http://www.myserver.com/"> tag to your HTML <head> section.

Note: It is NOT recommended that you change the paths to relative paths from the cache directory (such as "../../path/to/images/art.gif"). This is because the whole point of this caching system is that files may or may not be cached according to your preferences. You will want to have the links working whether the HTML is read from a cached file (under the /cache/ directory) or from the dynamic script (in some other directory); absolute URLs guarantee this.
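The effect can be seen by resolving the same link against the two possible page locations. The sketch below is a simplified imitation of how a browser resolves paths, not a full URL resolver. resolve_link() is a hypothetical helper, and it assumes the page path starts with "/" and the link carries no scheme or query string.

```php
<?php
/**
 * Resolve a link found in a page against the URL path of that page.
 * Handles root-relative links plus "." and ".." segments only.
 */
function resolve_link(string $pagePath, string $link): string
{
    if ($link !== '' && $link[0] === '/') {
        // Root-relative link: the location of the page does not matter.
        $segments = explode('/', $link);
    } else {
        // Relative link: start from the directory that contains the page.
        $dir = substr($pagePath, 0, strrpos($pagePath, '/') + 1);
        $segments = explode('/', $dir . $link);
    }

    $out = [];
    foreach ($segments as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;               // skip empty and "current directory" segments
        }
        if ($segment === '..') {
            array_pop($out);        // step up one directory
            continue;
        }
        $out[] = $segment;
    }
    return '/' . implode('/', $out);
}

// The same relative link points to different files depending on where the page lives:
echo resolve_link('/show_article.php', 'images/art.gif'), "\n";
echo resolve_link('/cache/show_article/id_123.html', 'images/art.gif'), "\n";
// A root-relative link resolves identically from both locations:
echo resolve_link('/cache/show_article/id_123.html', '/path/to/images/art.gif'), "\n";
```

The first two calls return different paths for the same link, which is exactly the breakage described above. The third is unaffected by whether the page was served from the cache.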

The Caching Script
Central to the caching system is the caching script itself (cache.php). It reads the dynamic script’s output by using fopen(<dynamic script URL>), as if it were a browser. It then displays this output to the user and saves it to a static HTML file.

cache.php, itself, only uses basic PHP. It can also function independently of any other script. Consequently, you do not need to modify any existing scripts in order to implement script caching.
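As a rough illustration of that flow (and not the author's actual cache.php, which is presented later in the article), the core could be sketched as follows. The function names, the strict URL pattern and the injected $fetch callable are all assumptions made here so the example stays self-contained. For simplicity the sketch saves the file first and returns the HTML for the caller to echo.

```php
<?php
/**
 * Map a cache URL back to the dynamic URL that produces it.
 * Returns null if the URL does not match /cache/<script>/<param>_<value>.html.
 * The strict pattern also rejects "../" and other unexpected input.
 */
function dynamic_url_for(string $cacheUri): ?string
{
    $pattern = '#^/cache/([A-Za-z0-9_]+)/([A-Za-z0-9]+)_([A-Za-z0-9]+)\.html$#';
    if (!preg_match($pattern, $cacheUri, $m)) {
        return null;
    }
    return '/' . $m[1] . '.php?' . $m[2] . '=' . $m[3];
}

/**
 * Fetch the dynamic output for $cacheUri, store it under $docRoot,
 * and return the HTML so the caller can send it to the visitor.
 * $fetch is any callable that takes the dynamic URL and returns its output.
 */
function build_cache_file(string $cacheUri, string $docRoot, callable $fetch): ?string
{
    $dynamicUrl = dynamic_url_for($cacheUri);
    if ($dynamicUrl === null) {
        return null;
    }

    $html = $fetch($dynamicUrl);
    if (!is_string($html)) {
        return null;                       // fetch failed; do not cache anything
    }

    $target = $docRoot . $cacheUri;
    $dir = dirname($target);
    if (!is_dir($dir)) {
        mkdir($dir, 0755, true);           // create /cache/<script>/ on first use
    }
    file_put_contents($target, $html, LOCK_EX);

    return $html;
}

// Possible wiring, assuming allow_url_fopen is enabled and www.myserver.com is the site host:
// $fetch = function ($url) { return file_get_contents('http://www.myserver.com' . $url); };
// echo build_cache_file($_SERVER['REQUEST_URI'], $_SERVER['DOCUMENT_ROOT'], $fetch);
```

Because the dynamic script is reached through its URL rather than included, it runs exactly as it would for a browser, which is why existing scripts need no modification.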

Activating the Caching Script
The recommended method for activating cache.php is by way of the "404 Page Not Found" event, thereby automating its execution and minimizing its impact on the site.

The "404 Page Not Found" error informs the visitor that the server could not find his/her requested page. Most of the time a standard "Page Not Found" page is displayed. However, since most Web servers enable you to customize your error pages, you can call the cache.php script when a file is not found in place of displaying the default "Page Not Found" page.

For example, in Apache, you can edit your configuration file (httpd.conf, located in the "apache/conf/" directory) by adding the following statement:

ErrorDocument 404 /cache.php
This statement assigns responsibility for handling a 404 error to the cache.php script. Apache will call this script when a file is not found in place of the default "Page Not Found" page.
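Since the script now receives every missing URL, not only cache URLs, its entry point has to tell the two apart. The sketch below shows one way to do that, assuming cached pages live under /cache/ as described earlier. classify_missing() is a hypothetical name. The sketch relies on Apache leaving the originally requested URL in REQUEST_URI when it invokes a local ErrorDocument, and it sets the status explicitly because the response otherwise keeps the 404 status.

```php
<?php
/**
 * Decide what the 404 handler should do with a request.
 * Returns 'build' for URLs under /cache/ and 'not_found' for everything else.
 */
function classify_missing(array $server): string
{
    // The URL the visitor originally asked for.
    $uri  = isset($server['REQUEST_URI']) ? $server['REQUEST_URI'] : '';
    $path = (string) parse_url($uri, PHP_URL_PATH);

    return strncmp($path, '/cache/', 7) === 0 ? 'build' : 'not_found';
}

if (classify_missing($_SERVER) === 'build') {
    // Replace the 404 status so browsers and crawlers treat the page as a normal one.
    header('HTTP/1.1 200 OK');
    // ... generate, display and save the page here ...
} else {
    // A genuinely missing page: keep the error status and show a message.
    header('HTTP/1.1 404 Not Found');
    echo 'Page Not Found';
}
```

Once a page has been generated and saved, later requests for the same URL find the static file, so Apache serves it directly and never calls the script.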

Warning: Be sure that a copy of the original configuration file is saved before changing it. It is always a good idea to k
