This article describes how to add a search engine to your website. We will use the extensions indexed_search (already preinstalled with TYPO3) and crawler for indexing content, and the extension macina_searchbox for the search mask that will be placed on each page.
Summary of this article:
PLEASE NOTE: You should consider reading a much more elaborate version of this article that I wrote more recently: Indexed Search & Crawler - The missing manual.
The crawler extension is used to manage cron jobs. It lets us choose what should be indexed and when. This is useful because, out of the box, indexed_search indexes content (pages and associated documents) while visitors are browsing your website; that is not very fair to them, as indexing adds time to rendering the page.
Let’s install these two extensions:

The next step is to configure the indexed_search extension. The following screenshot shows the options to configure. We will need a few tools:
The other important configuration options are to deactivate document indexing from the frontend, since we will configure cron jobs for this task, and to specify that external files (PDF, …) should be indexed in a separate process rather than together with the related web page.

Before creating the cron job, we will configure our website to be index-ready. In the Setup part of our template, let’s add:
    page.config.index_enable = 1
    page.config.index_externals = 1
Then, in the page TSconfig of our homepage (that is, the root of our site), let’s add:
    tx_crawler.crawlerCfg.paramSets {
        tt_content = &L=[0-1]
        tt_content.procInstrFilter = tx_indexedsearch_reindex
        # if extension cachemgm is available too:
        # ... = tx_indexedsearch_reindex, tx_cachemgm_recache
        tt_content.baseUrl = http://www.domain.tld/
    }
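To illustrate the value syntax: the range &L=[0-1] makes the crawler queue each page once per value, that is, once with &L=0 and once with &L=1. The following is only a sketch for a site whose language IDs are not contiguous. It assumes that your version of crawler accepts a pipe-separated list of values (please check its manual), and the language IDs 0, 1 and 3 are hypothetical:

```typoscript
tx_crawler.crawlerCfg.paramSets {
  # Assumption: languages 0, 1 and 3 exist on this site (hypothetical IDs)
  tt_content = &L=[0|1|3]
  tt_content.procInstrFilter = tx_indexedsearch_reindex
  tt_content.baseUrl = http://www.domain.tld/
}
```

On a single-language site, the language parameter probably brings nothing and only multiplies the number of URLs to crawl.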
Using the Web > List module, we create a new record of type Indexing configuration at our website root (homepage). Choose the type page tree and select the root page of the website. The indexing depth may be set to 1 if all of our pages are accessible from the root page (that’s often the case). Now save the configuration and check that it is active (a red question mark on the configuration icon means that the configuration is hidden and thus deactivated).
Now click on Web > Info, select the root page and then, in the information screen, choose crawler in the drop-down list.
Check that the site crawling will be performed on all sublevels (infinite), select the Re-indexing processing instruction, click on Update and then Crawl URLs. Our site will then start to be indexed. We may check the indexing status by choosing Indexed search in the drop-down list of the information screen.

For the crawler extension to work, we have to create a new user named _cli_lowlevel in Tools > User Admin. Neither the password nor the access rights matter.
That’s it! All that remains is to create the cron job:
* * * * * www-data php /path/to/typo3/cli_dispatch.phpsh crawler
Let’s start by creating a page that will contain the search result list. In the page properties, we should hide it in menus, as we will shortly add a search mask to every page of our website. Now we may add the indexed_search plugin as a content element:

To create the search mask itself, we install the extension macina_searchbox, just as we did for the two extensions related to content indexing. Then, in the Setup part of our template, we add the code below to include the plugin:
    plugin.tx_macinasearchbox_pi1 {
        pidSearchpage = 12
        templateFile = fileadmin/templates/search_template.html
    }
    lib.searchbox < plugin.tx_macinasearchbox_pi1
The parameter pidSearchpage is the ID of the page containing the indexed_search plugin. The parameter templateFile defaults to the file EXT:macina_searchbox/pi1/template.htm. You may copy that file locally, as done above, and customize it according to your needs.
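Defining lib.searchbox does not display anything by itself; the object still has to be placed in the page output. Here is a minimal sketch, assuming the page is rendered by a marker-based TEMPLATE object at page.10. The template file name main.html and the marker ###SEARCHBOX### are hypothetical and should be adapted to your own template:

```typoscript
page.10 = TEMPLATE
page.10 {
  template = FILE
  # Hypothetical main HTML template of the site
  template.file = fileadmin/templates/main.html
  # Replaces ###SEARCHBOX### in main.html with the search mask
  marks.SEARCHBOX < lib.searchbox
}
```

If your site uses another templating approach, lib.searchbox can be referenced in the same way wherever a content object is accepted.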
You may read the documentation of the extension indexed_search if you wish to gain more control over the search results and the indexing options.
