Hi,

I've created a script to understand the logic of caching. Do you think my example below makes sense? I am asking because I've never used caching before.

1. Check whether an HTML version of the current testfile.php exists in the cache folder as testfile.html.
   a. If YES, compare the content of testpage.php with testpage.html.
      i. If the content is the same, print from the HTML. DONE!
      ii. If not, print the content of testpage.php and create testpage.html to be used in future. DONE!
   b. If NO, print the content of testpage.php and create testpage.html to be used in future.

Note: I don't think this example is compatible with dynamically generated web pages. If I am correct, could you please help me enhance this example?

Thanks

testpage.php

<?php
//**********************************************************************
//** CACHE CLASS *******************************************************
//**********************************************************************
class CacheBase{
	public function createCacheFile($pageNameIN, $contentIN){
		// Create copy of php file and save as html in cache/ folder
		$newFile = fopen("cache/".$pageNameIN.".html", "w");
		fwrite($newFile, $contentIN);
		fclose($newFile);
	}
	
	public function checkContentBased($pageNameIN, $contentIN){
		// Get the content of existing html file
		$existingContent = file_get_contents("cache/".$pageNameIN.".html");
		
		// Return relevant values based on result.
		// True = Overwrite existing file with new content
		// $existingContent = Don't overwrite existing file just print
		if($contentIN === $existingContent){
			return $existingContent;
		} else {
			return true;
		}
	}
}
//**********************************************************************
//** END ***************************************************************
//**********************************************************************


// Start output buffering to capture everything the page prints
ob_start();

$body = "<html>\n";
$body.= "<head>\n";
$body.= "<title>Static caching</title>\n";
$body.= "</head>\n";
$body.= "<body>\n";
$body.= "<p>This testpage.php/html cache file was generated automatically</p>\n";
$body.= "</body>\n";
$body.= "</html>\n";

echo $body;

// Save all the content from the webpage into a variable
$content = ob_get_contents();

// Discard the buffer so nothing is sent to the client yet
ob_end_clean();


//**********************************************************************
//** CACHE CONTROL *****************************************************
//**********************************************************************

// Initiate Cache class
$objCache = new CacheBase();
// Name of the page, used as the cache file name (hard-coded here;
// it could be derived from the URL instead)
$pageName = "testfile";

// If the copy of requested file doesn't exist in cache folder as html
// then create it and print it otherwise continue further checks
if(! file_exists("cache/".$pageName.".html")){
	// No cached copy yet: print the freshly generated content and
	// save it as the new HTML copy below
	echo $content;
	
	$objCache->createCacheFile($pageName, $content);

} else {
	// $overWrite = either true or copy of HTML file from cache
	$overWrite = $objCache->checkContentBased($pageName, $content);
	
	// If new content differs from existing one then overwrite html file
	// stored in cache folder otherwise use old HTML copy from cache
	if($overWrite === true){
		// Content has changed: print the freshly generated content and
		// overwrite the HTML copy below
		echo $content;
	
		$objCache->createCacheFile($pageName, $content);
		
	} else {
		// Old cache HTML file is being used
		echo $overWrite;
	}
}
//**********************************************************************
//** END ***************************************************************
//**********************************************************************
?>

This is not caching. You're actually causing your script to do more work when it tries to serve from cache than when you're just generating the content dynamically.

The way caching should work is this: when your script runs, it attempts to load a cached copy via a unique identifier (key, hash, filename, etc). When that cache request fails, because the entry has expired or does not exist, the script executes the code whose result will be cached, saves that result to the cache, and displays what it generated.

This is hopefully just enough to give you an idea of what you might expect. If you want a more appropriate OOP solution, you'd probably want a Cache interface that multiple cache adapters (file, db, mongo, memcache, etc) can implement, so you have a consistent API. Perhaps even all tucked away behind a cache factory that handles creating the different cache instances based on a supplied adapter.

<?php

class Cache
{
	protected $_key;
	
	public function __construct( array $settings = array() )
	{
		//Do something with your settings
		//Set maximum lifetime?
		//Set directory path?
	}
	
	public function load( $key )
	{
		$this->_key = $key;
		
		if( $this->isValid( $key ) )
		{
			$cache = file_get_contents('some/path/to/some/file.cache');
			return $cache;
		}
		
		return false;
	}
	
	public function save( $cacheable )
	{
		//saves the generated content somewhere
		//maybe check for object and serialize automatically (setting?)
	}
	
	public function isValid( $key )
	{
		//Validate the cache exists and is not expired etc.
		//return true or false
	}
	
	public function expire( $key )
	{
		//Attempts to destroy the cache identified by key
	}
}

Your usage would be something like:

$cache = new Cache();

//Cache::load() returns false when it gets a miss
if( ( $content = $cache->load('my_unique_identifier') ) === false ){
	//Do process intensive tasks and save generated output.
	$loops = 10000;
	$content = '';
	for($i = 0; $i < $loops; $i++ ){
		$content .= 'Some text to fill the array'.PHP_EOL;
	}

	//Cache already knows the key to save this because of the call to Cache::load()
	$cache->save($content);
}

//We either get this from cache or we get this from the loop that generates the 10000 lines
echo $content;

Hopefully this gets you more on track with what you're hoping to achieve. There are LOTS of valid ways to do this. The ultimate goal is to eliminate an area of intensive work by storing the generated results and bypassing the intensive piece of code on subsequent loads.
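One way the gaps in the Cache skeleton above might be filled in is sketched below. It is a minimal file-based version under stated assumptions, not a definitive implementation: the class name FileCache, the 'dir' and 'lifetime' setting names, the md5-based file naming, and the 5 minute default are all made up for illustration.

```php
<?php
// Hypothetical file-backed take on the Cache skeleton.
// The 'dir' and 'lifetime' setting names are assumptions for this sketch.
class FileCache
{
    protected $_key;
    protected $_dir;
    protected $_lifetime;

    public function __construct(array $settings = array())
    {
        // Fall back to the system temp directory and a 5 minute lifetime
        $this->_dir = isset($settings['dir']) ? $settings['dir'] : sys_get_temp_dir();
        $this->_lifetime = isset($settings['lifetime']) ? (int) $settings['lifetime'] : 300;
    }

    // Hash the key so any string maps to a safe file name
    protected function path($key)
    {
        return rtrim($this->_dir, '/\\') . DIRECTORY_SEPARATOR . md5($key) . '.cache';
    }

    public function load($key)
    {
        $this->_key = $key;
        if (!$this->isValid($key)) {
            return false; // miss: the caller regenerates and calls save()
        }
        $raw = file_get_contents($this->path($key));
        return $raw === false ? false : unserialize($raw);
    }

    public function save($cacheable)
    {
        // Uses the key remembered by the last load() call
        if ($this->_key === null) {
            throw new LogicException('Call load() before save()');
        }
        $data = serialize($cacheable);
        return file_put_contents($this->path($this->_key), $data, LOCK_EX) !== false;
    }

    public function isValid($key)
    {
        $file = $this->path($key);
        clearstatcache(true, $file); // avoid a stale modification time
        return is_file($file) && (time() - filemtime($file)) < $this->_lifetime;
    }

    public function expire($key)
    {
        $file = $this->path($key);
        return is_file($file) ? unlink($file) : true;
    }
}
```

It drops into the usage example above unchanged apart from the class name. One limitation of this sketch: because load() signals a miss with false, a cached value that is itself false cannot be told apart from a miss.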


Sorry to jump in here - not trying to hijack, honest. Cache seems to be a black box for me.

This expiry thing. It's set manually by the max lifetime? If data is updated, and later the code runs again and the cached version is used (within max lifetime), no display of updated data right? So, how long should the max lifetime be?

I realise that this is a 'how long is a piece of rope' question, but I'm a little confused.

Cache expiration time entirely depends upon what is being cached. Without a particular use case there isn't really a "standard".


OK. Thanks, thought it was a daft question. :(
Thinks I have to do some more research on this.

mschroeder, thanks for the example. As I am new to this caching stuff, I don't know how to fill the gaps in the functions of your Cache class.

I went through some examples on Google, but the comments on the code are split 50-50 between good and bad, so I cannot decide which one to go for, as I don't have enough knowledge about it. I need a working example, so, relying on your knowledge and experience, could you please suggest some examples?

Thanks

For many years now I have read in forums that “PHP isn’t for caching”, but my tests point the other way. It depends on what you are caching, what server you are on, and what caching logic you are using. Since we (the PHP programmers' community) embraced OOP, I believe that caching will be one key point of programming. (Take a look at http://www.daniweb.com/web-development/php/threads/353823 )

-veledrom

My suggestion to you: if you want to learn how to code well, you should be looking at well-coded examples.

Symfony's File Cache (Many others in their package as well):
http://www.symfony-project.org/api/1_4/sfFileCache
Examples:
http://snippets.symfony-project.org/snippet/110
http://snippets.symfony-project.org/snippet/99

Zend Cache
http://framework.zend.com/manual/en/zend.cache.html
Code:
http://framework.zend.com/svn/framework/standard/branches/release-1.11/library/Zend/Cache.php
http://framework.zend.com/svn/framework/standard/branches/release-1.11/library/Zend/Cache/

CakePHP Cache
http://api.cakephp.org/class/cache

You'll see they are all very different but generally similar to one another. I would use these to try and put something of your own together as they'll illustrate lots of best practices and be well documented.

-ardav

Caching is kind of an open-ended question. In my experience, the places where you think you need a cache are usually not the places that benchmark poorly. A rule of thumb would be to cache external resources. E.g. if you're displaying DaniWeb's RSS feed, you wouldn't use file_get_contents, simplexml, dom, etc. to parse it every time a page loads; you'd call that only if your local cache didn't exist. And since you're caching that, you might as well cache the rendered results before they're displayed, as they won't be changing either. Then you're doing almost no actual work to display that remote feed until the cache expires; every hour or two would be practical.
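The feed rule of thumb above could look roughly like this. It is a sketch only: cachedFetch(), $cacheFile, $maxAge, and the $fetch callback are hypothetical names, and falling back to a stale copy when the remote fetch fails is an extra assumption that the post does not mention.

```php
<?php
// Return the cached copy while it is fresh; otherwise call $fetch and store
// the result. All names here are hypothetical, chosen for this sketch.
function cachedFetch($cacheFile, $maxAge, callable $fetch)
{
    clearstatcache(true, $cacheFile);
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
        return file_get_contents($cacheFile); // hit: no remote call, no parsing
    }

    $fresh = $fetch(); // expensive part: download, parse, render
    if ($fresh === false) {
        // Remote source failed: serve the stale copy if one exists (assumption)
        return is_file($cacheFile) ? file_get_contents($cacheFile) : false;
    }

    file_put_contents($cacheFile, $fresh, LOCK_EX);
    return $fresh;
}
```

In a page, $fetch would wrap the simplexml or DOM parsing and return the rendered HTML string (or false on failure), and $maxAge would be 3600 for the hourly refresh suggested above.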

Another area that I've seen improved with caching is large static objects, like a site-wide ACL. Unless you're making changes to the ACL, it is static, so generate the object once, serialize it, and save it to a file or your database etc. Then you alleviate the need to generate a heavy object on every request.
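A rough sketch of the serialize-once idea above, using a plain file. buildAcl(), loadAcl(), and invalidateAcl() are hypothetical helpers, and the small array merely stands in for whatever heavy object the real application builds.

```php
<?php
// buildAcl() stands in for an expensive, rarely changing structure (made up).
function buildAcl()
{
    return array(
        'admin'  => array('read', 'write', 'delete'),
        'editor' => array('read', 'write'),
        'guest'  => array('read'),
    );
}

// Load the ACL from its serialized copy, building and saving it on a miss
function loadAcl($cacheFile)
{
    if (is_file($cacheFile)) {
        $raw = file_get_contents($cacheFile);
        $acl = $raw === false ? false : unserialize($raw);
        if ($acl !== false) {
            return $acl;
        }
    }
    $acl = buildAcl();
    file_put_contents($cacheFile, serialize($acl), LOCK_EX);
    return $acl;
}

// Call this from whatever code edits the ACL so the next request rebuilds it
function invalidateAcl($cacheFile)
{
    if (is_file($cacheFile)) {
        unlink($cacheFile);
    }
}
```

There is no expiry time here on purpose: the copy stays valid until the code that changes the ACL explicitly invalidates it.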

My caution, though, is to avoid over-caching. It is easy to see the immediate benefits of caching output and the increase in speed of the site, but it can cause really severe issues when it comes to testing or troubleshooting a bug. E.g. you fix a bug in your code and load up the page to make sure everything is still working (unit testing would greatly reduce the need for this), and you get the cached results that make everything look like it is working correctly. An hour later you've moved on to other tasks, and your client calls up and says, "you stupid !#$@#$!@# my website is broken!"

-jkon

Personally I feel like dynamic languages, like php, have a lot to gain from caching, especially since there is no persistence beyond the request for objects and the like.

However, I did check out the link to the other thread you posted and I have to disagree with you on caching of objects. Seldom have I run into an object that is so costly to create that it needs to be cached for an extended period of time, unless like I said earlier it is pretty static, like an ACL or a Configuration.

I think dependency injection is a much better solution to the same problem: over the life of the request, you prevent the same object from being created more than once. I do agree with you that caching should be the result of tests and benchmarks that have determined it is necessary to cache a particular aspect of your code.
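To illustrate the dependency injection point above, here is a minimal sketch of a container that builds each service once per request and hands the same instance to everything that needs it. Container, Config, and ReportService are all made-up names; real frameworks ship far more complete containers.

```php
<?php
// Container is a hypothetical, minimal service container for this sketch.
class Container
{
    private $factories = array();
    private $instances = array();

    // Register how to build a service; nothing is built yet
    public function set($name, callable $factory)
    {
        $this->factories[$name] = $factory;
    }

    // Build the service on first use, then hand back the same instance
    public function get($name)
    {
        if (!isset($this->instances[$name])) {
            if (!isset($this->factories[$name])) {
                throw new InvalidArgumentException("Unknown service: $name");
            }
            $this->instances[$name] = call_user_func($this->factories[$name], $this);
        }
        return $this->instances[$name];
    }
}

// Config and ReportService are made-up classes standing in for real ones
class Config
{
    public static $built = 0; // counts constructions, for demonstration only
    public function __construct() { self::$built++; }
}

class ReportService
{
    public $config;
    public function __construct(Config $config) { $this->config = $config; }
}

$container = new Container();
$container->set('config', function () { return new Config(); });
$container->set('reports', function (Container $c) {
    // The shared Config is injected instead of being created again
    return new ReportService($c->get('config'));
});
```

Nothing is persisted between requests here; the object is simply created once and reused for the rest of the request.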

As for where caching is best done, this depends a lot on the architecture of your code. Personally, I like working with service layers in my MVC applications: my business and application logic is implemented in the service layer, which in turn interacts with application models that are mapped to data stores and data access objects by data mapper classes. Because your responsibilities are so separated, you end up with excellent points to cache almost any aspect of your system, anywhere, at any time.

Before I go through the links you have given, let me tell you why I am asking all these questions. The database will probably give up or slow down if I don't use caching, as page calls will send queries to the database every time pages are requested by users. I want to prevent this.

What I really have in mind is this:

page1.php will expire every 5 minutes, regardless of whether the database gets updated.
page2.php will expire whenever the database gets updated.
page3.php will never be cached.

Note: The content of every page is built from data coming from the database.
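The three policies above can be written down as a small freshness check. This is only a sketch: the $policies table, the policy type names, and cacheIsFresh() are hypothetical, and it assumes the application records somewhere (for example a timestamp touched on every write) when the database was last updated.

```php
<?php
// Hypothetical policy table for the three pages described above
$policies = array(
    'page1.php' => array('type' => 'ttl', 'seconds' => 300), // 5 minutes
    'page2.php' => array('type' => 'on_update'),
    'page3.php' => array('type' => 'none'),
);

// Decide whether a cached copy may still be served.
// $cachedAt    = Unix time the cache file was written
// $dbUpdatedAt = Unix time of the last database change (assumed to be tracked)
// $now         = current Unix time
function cacheIsFresh(array $policy, $cachedAt, $dbUpdatedAt, $now)
{
    switch ($policy['type']) {
        case 'ttl':
            // Fresh until the lifetime runs out, whatever the database does
            return ($now - $cachedAt) < $policy['seconds'];
        case 'on_update':
            // Fresh only if written strictly after the last database change
            return $cachedAt > $dbUpdatedAt;
        default:
            // 'none': never serve from cache
            return false;
    }
}
```

A page would call cacheIsFresh() with filemtime() of its cache file and time(), serve the file when the result is true, and regenerate otherwise.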


@ veledrom: I'm really sorry, I seem to have taken focus away from your original point. Won't do it again - promise.
@ MS - great explanation, as always, thanks.

-veledrom

I'm not sure I really understand the utility in caching the way you have described it. How have you arrived at the conclusion the database will slow down or give up if you don't use caching?

Sounds to me like you're trying to over optimize your code before you see where the code is going to bottleneck in the first place.

To be honest, I read it somewhere on a website; that's why I came out with such a statement.

OK, let me get things clear. You have a dynamic website and it always gets updated by users and admin etc. When would you use caching, and what would be the logic?

Sorry I am not testing you, just trying to clear my view of caching. Sometimes reading many bits from the Net creates confusion to people like me.

Thanks to everyone though for contributions so far.

Unless you are benchmarking your code and load testing it for these kinds of slowdowns while you're developing, you're really not going to gain much by adding caching for the sake of adding caching. Caching is part of a scaling plan, and you can't tell where your code needs to scale until you know where your code fails and why.

The functionality you described is exactly what every dynamic database driven piece of code does. These slides might be of interest http://www.slideshare.net/JustinCarmony/effectice-caching-w-php-caching. Although the slides talk about memcache specifically the concepts they cover are relevant to all forms of cache storage.

Sounds like we're coming to the end of this post.

I don't mind not using caching at all, so shall I just ignore it then? If not, is there anything that would be a little helpful for me to use as standard, something small and simple like setting headers etc? Probably something small is better than nothing.
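As a sketch of the "setting headers" idea asked about above, the snippet below builds Cache-Control and ETag headers and answers a matching If-None-Match with 304 Not Modified. conditionalResponse() and sendConditional() are hypothetical helpers, and the comparison is simplified: it handles a single ETag value only, not lists or weak validators.

```php
<?php
// Decide the response for a conditional GET. The status and headers are
// returned rather than sent so the logic can be checked in isolation.
function conditionalResponse($body, $ifNoneMatch, $maxAge)
{
    $etag = '"' . md5($body) . '"';
    $headers = array(
        'Cache-Control' => 'public, max-age=' . (int) $maxAge,
        'ETag'          => $etag,
    );

    // The browser already holds this exact content: send no body
    if ($ifNoneMatch !== null && trim($ifNoneMatch) === $etag) {
        return array('status' => 304, 'headers' => $headers, 'body' => '');
    }
    return array('status' => 200, 'headers' => $headers, 'body' => $body);
}

// Send the result for the current request (call before any other output)
function sendConditional($body, $maxAge)
{
    $sent = isset($_SERVER['HTTP_IF_NONE_MATCH']) ? $_SERVER['HTTP_IF_NONE_MATCH'] : null;
    $response = conditionalResponse($body, $sent, $maxAge);

    http_response_code($response['status']);
    foreach ($response['headers'] as $name => $value) {
        header($name . ': ' . $value);
    }
    echo $response['body'];
}
```

Note that this saves bandwidth and repeat requests rather than server work: the page body is still generated on every request that reaches PHP so that its ETag can be computed.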

Thanks

-mschroeder

What are you caching, and why? From my point of view, you are caching either objects (or objects that are lists of objects) that are more expensive to retrieve in other ways than from the cache, or the output. In most cases the “expensive” one is the first. If you build your model cleanly, you have an object (which might be a list of objects) as output for any input, and the Controller decides which one to use. There are many MVC implementations, and although I don’t consider most of them clean, that has to do with each person's programming background and work. There are many patterns that I am not acquainted with (like pull-based), and I can’t really express an opinion beyond my quick view. In a traditional push-based MVC logic, SOA (Service Oriented Architecture) is a part of the model; it doesn’t matter how you will expose the results of the logic (WS, AJAX, HTML, EJB or anything), as this is the job of a Controller. I understand that there could be many other views on this, and they are more than welcome. To be honest, what bothers me is working in a company that embraces logic other than the mainstream without full comments or a tutorial, names it “SOA” or even “agile”, and thinks that it is at the top of technology.

You have a dynamic website and it always gets updated by users and admin etc. When would you use caching, and what would be the logic?

Who updates what is the question. If all the content is updated by both admin and users, then there is no reason for caching (except when you receive a lot more visits to content that is not changing in that period of time; in that case you should consider caching for scaling). If the admin updates some parts of the “page” and users update others, then you should consider dividing the logic and using caching for the first. As I stated before, I am not claiming infallibility; this is just my point of view. Hope I helped.

OK, I will look at caching objects a bit. Thank you very much for all your input. At least I have learned what to focus on now.
