code blog

Concatenate a string, and terminate html within

language: PHP

intent

Use PHP to trim a string of HTML to a set number of characters.

problem

The string contains valid HTML, and cutting it can leave one or more tags unterminated. If a string containing unterminated tags is put on the page, it breaks the rest of the HTML document.
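
To make the problem concrete, here is a minimal sketch. The sample markup and the 24-character limit are made up for illustration and do not come from the code further down:

```php
<?php
// Hypothetical sample input, chosen only to show the effect.
$html = '<div><p>Hello <b>brave new</b> world</p></div>';

// substr() counts bytes and knows nothing about markup,
// so the cut lands in the middle of the <b> element.
$cut = substr($html, 0, 24);

echo $cut, "\n"; // <div><p>Hello <b>brave n
```

The result still has `<div>`, `<p>` and `<b>` open, so anything printed after it would end up nested inside those elements.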

requirements

  • chop the string
  • scan the string for opened HTML tags
  • append the necessary end tags ("</tag>") to the string.

implementation

define and use an object with the ability to:

  1. substr() the input string to length
  2. start a search through the truncated string
  3. scan the string one character at a time with PHP's string offset syntax ($string[$idx])
  4. when a "<TAG>" pattern is found, add "TAG" to the "tags to close" array
  5. when a "</TAG>" pattern is found, remove "TAG" from the "tags to close" array
  6. end the search
  7. for each element still in the "tags to close" array (in reverse order), close the tag by writing the "</TAG>" pattern
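
The steps above can also be sketched in a compact form. This version swaps the character-by-character scan for a regular expression, so it is an alternative illustration and not the class shown under "source:". The function name close_open_tags() is hypothetical, and the sketch assumes the input has no void elements (such as br or img) and no tag cut in half:

```php
<?php
// Sketch only: close_open_tags() is a made-up helper name.
function close_open_tags(string $html): string
{
    $open = [];
    // Group 1 is the optional slash of an end tag, group 2 is the tag name.
    preg_match_all('#<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>#', $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        if ($m[1] === '/') {
            array_pop($open);            // step 5: an end tag closes the most recent open tag
        } else {
            $open[] = strtolower($m[2]); // step 4: remember the tag so it can be closed later
        }
    }
    // step 7: write the end tags in reverse order of opening
    $closing = '';
    foreach (array_reverse($open) as $tag) {
        $closing .= '</' . $tag . '>';
    }
    return $html . $closing;
}

// Step 1 (substr) is done by the caller here.
echo close_open_tags(substr('<div><p>Hello <b>brave new</b> world</p></div>', 0, 24)), "\n";
// <div><p>Hello <b>brave n</b></p></div>
```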


source:


   $killer = new tagEndingConcatMachine ();
   $killer->summaryLength = (int) $_POST['length']; // form values arrive as strings; the length must be an integer
   $killer->end = "...";
   echo $killer->chop_and_append($_POST['input'])."Read more";


class tagEndingConcatMachine {
	public $end = '...';
	public $summaryLength = 100;
	

	private $tags_to_end = array();

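    public function chop_and_append($x){
        $summary = substr($x, 0, $this->summaryLength);
		if($summary !== $x){
			// start clean on every call; end_tags() appends the closing tags to $this->end
			$this->tags_to_end = array();
			$end = $this->end;
			$this->end_tags($summary);
			$result = $summary . $this->end;
			$this->end = $end; // restore the suffix so the object can be reused
			return $result;
		}
		return $summary;
    }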

    private function end_tags(&$summary){
        // strlen() is re-evaluated on every pass because track_tag() may shorten $summary
        for($i = 0; $i < strlen($summary); $i++){
          if($summary[$i]=='<'){
            $this->track_tag($summary, $i);
          }
        }
		// close whatever is still open, most recently opened tag first
		for($i = count($this->tags_to_end) - 1; $i >= 0; $i--){
			$this->end .= '</'.$this->tags_to_end[$i].'>';
		}
    }

    private function track_tag(&$summary, $i){
		// void elements never take an end tag, so they are not tracked
		$void = array('area','base','br','col','embed','hr','img','input','link','meta','source','track','wbr');
		$this_tag = '';
		$endloop = false;
		$ending = false;
		$in_name = true; // false once the tag name is complete (attributes may follow)
		$len = strlen($summary);
		for($k = $i+1; $k < $len && !$endloop; $k++){
			$thischar = $summary[$k];
			if($thischar=='/' && $summary[$k-1]== '<'){
				$ending = true;
			}elseif($thischar=='>'){
				if($this_tag!=''){
					if($ending)
						array_pop($this->tags_to_end);
					elseif($summary[$k-1]!='/' && !in_array(strtolower($this_tag), $void))
						$this->tags_to_end[] = strtolower($this_tag);
				}
				$endloop = true;
			}elseif($in_name && ($thischar=='-' || ctype_alnum($thischar))){
				$this_tag .= $thischar;
			}else{
				// whitespace, "/", "!" etc.: the name is over, skip attributes up to ">"
				$in_name = false;
			}
		}
		if(!$endloop){
			// substr() cut this tag in half (for example "<di" or "</h").
			// Drop the fragment from the summary; if it was an end tag,
			// tags_to_end still holds the element, so it gets closed properly.
			$summary = substr($summary, 0, $i);
		}
    }
}

known issues:

  1. The initial substr() method might chop a tag. For example, the return of substr() ends with: "<di" or "<div" or "</h", etc...
  2. The input HTML string might not be valid HTML, e.g. "<div><h1>invalid html</div></h1>" or "</p><p>what's with the random </p> before this <p>, man?</p>"
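
For the first issue, one possible approach is to strip a trailing tag fragment before scanning for open tags. The sketch below is an illustration, not part of the class above: drop_partial_tag() is a hypothetical helper, and it assumes literal "<" characters in the text are encoded as &lt;, as they would be in valid HTML. It does nothing for the second issue.

```php
<?php
// Hypothetical helper: removes a tag that substr() cut in half.
function drop_partial_tag(string $summary): string
{
    // A "<" followed only by non-">" characters up to the end of the string
    // is an unfinished tag such as "<di", "<div" or "</h".
    return preg_replace('/<[^>]*$/', '', $summary);
}

echo drop_partial_tag('<div><p>Hello</p'), "\n"; // <div><p>Hello
echo drop_partial_tag('<p>ok</p>'), "\n";        // <p>ok</p>
```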

Comments

Another solution

This suggestion doesn't really solve your specific problem but is a similar solution I found:
Use strip_tags() to strip off all HTML and then put your own p and/or div tags around the concatenated strip_tags() text, e.g.

$length = 55;
$string = "<p>".substr(strip_tags($string), 0, $length)."</p>";

This will work for some instances of the above problem, but obviously not all.
Worked for me anyway :)
Cheers
