Pascal and Francis Bibliographic Databases

Expert Search

The Expert Search page is reached from a link on the Advanced Search page. Enter the entire query in the text area called Search Builder. Click the See the results button to view the search results. The Search button only displays the number of items found, in the history list under the Search Builder.

Each search is recorded, and a history of your queries is saved. You can combine your searches by clicking on the identifier of each query (#n).

From the second query added to the Historic Search Combinatorics field onwards, you must specify the Boolean operator between the queries (OR, AND or NOT). Click the Combine button to view the search results.
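
Conceptually, combining recorded queries amounts to set operations on their result sets. The sketch below illustrates this with Python sets. The record identifiers and the `combine` helper are hypothetical and are not part of the application.

```python
def combine(left, operator, right):
    """Combine two result sets the way AND, OR and NOT combine queries."""
    if operator == "AND":
        return left & right   # records found by both queries
    if operator == "OR":
        return left | right   # records found by either query
    if operator == "NOT":
        return left - right   # records found by the first query only
    raise ValueError(f"unknown operator: {operator}")


# Hypothetical record identifiers returned by queries #1 and #2.
query_1 = {"rec-a", "rec-b", "rec-c"}
query_2 = {"rec-b", "rec-c", "rec-d"}

print(sorted(combine(query_1, "AND", query_2)))  # ['rec-b', 'rec-c']
print(sorted(combine(query_1, "NOT", query_2)))  # ['rec-a']
```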

How to write queries in the search builder?
The application uses the Elasticsearch 5 search engine, which offers numerous possibilities: combining field names with Boolean or proximity operators, regular expressions, fuzzy search, and relevance boosting.

If the query is written in the builder without specifying a particular field, the search is made on “all fields” of the record. The query is analyzed as a series of terms and operators. A term may be a single word, such as nuclear, or a phrase enclosed in double quotes, such as "nuclear power". A phrase search tests for the presence of all the terms in the listed order, that is, the first term followed by the second ("term1 term2").
Warning: Avoid copying and pasting terms or expressions from text editors, since this operation can alter the double quotes. Misinterpreted by the search engine, altered quotes lead to aberrant results.
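
If pasted text cannot be avoided, the quotes can be checked and straightened before the query is entered. The following is a minimal sketch, assuming the alteration in question is the replacement of straight double quotes by typographic ones. The `straighten_quotes` helper and the list of characters are illustrative assumptions.

```python
# Typographic quotes that word processors commonly substitute for the
# straight double quote: left/right curly quotes, low quote, guillemets.
TYPOGRAPHIC_QUOTES = "\u201c\u201d\u201e\u00ab\u00bb"


def straighten_quotes(query):
    """Replace typographic double quotes with the straight double quote."""
    table = {ord(ch): '"' for ch in TYPOGRAPHIC_QUOTES}
    return query.translate(table)


pasted = "\u201cnuclear power\u201d"
print(straighten_quotes(pasted))  # "nuclear power"
```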

1. Use of Field Names
To search for nuclear or power in the title, the query is:
ti.\*:(nuclear OR power)
or simply
ti.\*:(nuclear power)
since the default operator linking the terms is “OR”.

To search for the exact phrase “Smith John” in the author field, write:
au.\*:"Smith John"

For further information, see: List of Searchable Fields in Expert Search
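
When many fielded queries have to be prepared, they can be assembled programmatically. The sketch below is only an illustration: `field_query` is a hypothetical helper that produces strings in the form shown above, to be pasted into the Search Builder.

```python
def field_query(field, value, exact=False):
    """Build a query on all subindexes of a field, e.g. ti.\\*:(...)."""
    # Exact phrases go in double quotes; other terms are grouped in parentheses.
    body = f'"{value}"' if exact else f"({value})"
    # The star in the field name is written with a leading backslash.
    return f"{field}.\\*:{body}"


print(field_query("ti", "nuclear power"))           # ti.\*:(nuclear power)
print(field_query("au", "Smith John", exact=True))  # au.\*:"Smith John"
```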

2. Reserved characters
Some characters are used for the processing of queries by the search engine.
The reserved characters are: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /

If you need to use any of the characters which function as operators in your query itself (and not as operators), then you should escape them with a leading backslash.
For instance, to search for (1+1)=2, you would need to write your query as \(1\+1\)\=2.

Failing to escape these special characters correctly could lead to a syntax error which prevents your query from running.
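
Escaping can be automated for text that must be searched literally. The following is a minimal sketch: `escape_reserved` is a hypothetical helper, it escapes single & and | characters as a simplification, and it drops < and >, which according to the Elasticsearch documentation cannot be escaped.

```python
# Reserved characters that can be escaped with a leading backslash.
RESERVED = set('+-=&|!(){}[]^"~*?:\\/')


def escape_reserved(text):
    """Escape reserved characters so that text is searched literally."""
    out = []
    for ch in text:
        if ch in "<>":
            continue  # cannot be escaped, so they are removed (see lead-in)
        if ch in RESERVED:
            out.append("\\")
        out.append(ch)
    return "".join(out)


print(escape_reserved("(1+1)=2"))  # \(1\+1\)\=2
```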

3. Boolean operators
By default, all terms are optional, as long as one term matches. A search for nuclear power plant will find any document that contains one or more of nuclear or power or plant. The default operator is “OR”. Elasticsearch supports the usual operators “AND”, “OR” and “NOT”. However, their use is less simple than it seems, because “NOT” takes precedence over “AND”, which takes precedence over “OR”.

Groups of terms can be enclosed in parentheses. Within a group, terms with no explicit operator between them are linked by the default operator, “OR”.

Parentheses can be used to form subqueries. In the following example:
((nuclear AND plant) OR (power AND plant) OR plant)
the term plant is associated with nuclear or power or with no term.
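
The precedence rule and the effect of parentheses can be illustrated with Python's own Boolean operators, which follow the same order (not, then and, then or). This sketch models only the stated precedence on a made-up document. It does not reproduce how the search engine itself evaluates a query, and `contains` is a hypothetical helper.

```python
def contains(document, word):
    """True if the word occurs in the document (case-insensitive)."""
    return word in document.lower().split()


doc = "nuclear power plant"

# nuclear OR power AND NOT plant, written without parentheses ...
ungrouped = (contains(doc, "nuclear") or contains(doc, "power")
             and not contains(doc, "plant"))
# ... is read as nuclear OR (power AND (NOT plant)).
explicit = contains(doc, "nuclear") or (
    contains(doc, "power") and (not contains(doc, "plant")))
# Parentheses around the OR give a different query.
regrouped = (contains(doc, "nuclear") or contains(doc, "power")) and (
    not contains(doc, "plant"))

print(ungrouped, explicit, regrouped)  # True True False
```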

Queries can also be written with the “+” (this term must be present) and “-” (this term must not be present) operators. All other terms are optional.
With these operators, a query can be written:
nuclear power +plant -reactor
plant must be present, reactor must not be present, and nuclear and power are optional.
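
The roles of required, prohibited and optional terms can be sketched as a simple filter. This is a simplified model, not the search engine's implementation: the `matches` helper is hypothetical, and it assumes that optional terms only influence ranking once at least one required term is given.

```python
def matches(document, required=(), prohibited=(), optional=()):
    """Simplified model of a query written with + and - operators."""
    words = set(document.lower().split())
    if any(term not in words for term in required):
        return False  # a +term is missing
    if any(term in words for term in prohibited):
        return False  # a -term is present
    if not required:
        # Assumption: with no required term, one optional term must match.
        return any(term in words for term in optional)
    return True


query = dict(required=["plant"], prohibited=["reactor"],
             optional=["nuclear", "power"])
print(matches("a power plant", **query))          # True
print(matches("nuclear reactor plant", **query))  # False
print(matches("nuclear power", **query))          # False
```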

4. Proximity searches
While a phrase query (e.g. "nuclear power") expects all of the terms in exactly the same order, a proximity query allows the specified words to be further apart or in a different order.
In the following example:
"clad material"~3
the terms clad and material can be separated by three terms at most. Because the allowed distance exceeds 2, the order of the terms is not enforced. The results of the query may therefore include documents containing, for example, "material from clad".
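
The distance can be understood as the number of position moves needed to turn the text into the exact phrase. The sketch below is a simplified model for two-term phrases only, assuming Lucene-style counting in which swapping two adjacent terms costs two moves. `within_slop` is a hypothetical helper and splits words on whitespace.

```python
def within_slop(text, first, second, slop):
    """True if `first second` matches the text within `slop` moves."""
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    for p in first_positions:
        for q in second_positions:
            # In the exact phrase, `second` sits one position after `first`.
            if abs(p - (q - 1)) <= slop:
                return True
    return False


print(within_slop("clad material", "clad", "material", 0))       # True
print(within_slop("material from clad", "clad", "material", 3))  # True
print(within_slop("material from clad", "clad", "material", 2))  # False
```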

5. Fuzziness
The fuzzy search operator “~” finds similar terms with a maximum of two changes, where a change is the insertion, deletion or substitution of a single character, or the transposition of two adjacent characters. It applies from the third letter of the word onwards.
For example, einstien~ finds records with einstein.

By default, up to two changes are allowed, but the search can be restricted to a single addition, deletion or substitution. This must be specified explicitly, as in the following example:
AISI304~1 finds AISI304L, AISI300 and AISI302 type steels.
While a search with:
AISI304~2 broadens the search by allowing two additions or substitutions, and also finds other steels such as AISI316, AISI321, AISI347, AISI1045, etc.
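
The number of changes between two terms can be computed with an edit distance that counts insertions, deletions, substitutions and transpositions of adjacent characters. The function below is a standard optimal string alignment distance, given as an illustration of the examples above. It is not the search engine's own code, and the name `edit_distance` is an assumption.

```python
def edit_distance(a, b):
    """Edit distance counting adjacent transpositions as one change."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


print(edit_distance("einstien", "einstein"))  # 1
print(edit_distance("AISI304", "AISI304L"))   # 1
print(edit_distance("AISI304", "AISI316"))    # 2
```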

6. Ranges
Ranges can be specified for date, numeric or string fields. Inclusive ranges are specified with square brackets [min TO max] and exclusive ranges with curly brackets {min TO max}.
For example, to find documents published before 2012, write:
py.\* : {* TO 2012}
Curly and square brackets can be combined:
py.\* : [2010 TO 2017} returns the records with publication dates 2010, 2011, 2012, 2013, 2014, 2015 and 2016.
This can also be written with Boolean operators as:
py.\*:(>=2010 AND <2017)
or with the + operator as:
py.\* : (+>=2010 +<2017)
An interval with only one fixed bound can be written:
py.\*:>2010
py.\*:>=2010
py.\*:<2010
py.\*:<=2010
for documents with a publication date, respectively, after 2010, in or after 2010, before 2010, and in or before 2010.
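
The difference between inclusive and exclusive bounds can be checked with a small sketch. `in_range` is a hypothetical helper in which `None` stands for an open bound (the * in the range syntax).

```python
def in_range(value, low, high, include_low=True, include_high=True):
    """Test a value against a range with inclusive or exclusive bounds."""
    if low is not None:
        if value < low or (value == low and not include_low):
            return False
    if high is not None:
        if value > high or (value == high and not include_high):
            return False
    return True


years = range(2008, 2020)
# [2010 TO 2017}: lower bound included, upper bound excluded.
print([y for y in years if in_range(y, 2010, 2017, include_high=False)])
# [2010, 2011, 2012, 2013, 2014, 2015, 2016]
```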

7. Boosting
The boost operator “^” makes one term more relevant than another. For instance, to find all documents about renewable energy sources while giving particular weight to geothermal energy, boost that phrase:
("geothermal energy")^4
as in the following search strategy:
ti.\*:"renewable energy sources" OR ti.\*:"geothermal energy"^4
The default boost value is 1, but it can be any positive floating point number. Values greater than 1 increase the relevance of the searched term in the results, while values between 0 and 1 reduce it. Boosts can also be applied to phrases or to groups.
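
The direction of the effect can be shown with a toy model in which each matching phrase contributes its boost to the score. This is not how Elasticsearch computes relevance (actual scores also depend on term statistics). The titles, the `toy_score` helper and the scoring rule are assumptions made for illustration only.

```python
def toy_score(title, clauses):
    """Toy model: sum the boost of every phrase found in the title."""
    text = title.lower()
    return sum(boost for phrase, boost in clauses if phrase in text)


# Phrases and boosts from the search strategy above (default boost is 1).
clauses = [("renewable energy sources", 1.0), ("geothermal energy", 4.0)]

# Hypothetical titles.
titles = ["Renewable energy sources in Europe",
          "Geothermal energy in Iceland"]

ranked = sorted(titles, key=lambda t: toy_score(t, clauses), reverse=True)
print(ranked[0])  # Geothermal energy in Iceland
```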

8. Regular expressions
Regular expressions are character strings that use a defined syntax to describe a set of possible character strings. Depending on the syntax used, a character or group of characters within a word can be repeated zero, one or several times, which defines a set of words with similar spellings. Regular expressions used in search queries make it possible to find results despite spelling mistakes. They must be enclosed within slashes, as follows:
au.\*:/joh?n(ath[oa]n)/
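
The set of spellings covered by this pattern can be explored outside the application. The sketch below uses Python's `re` module, whose syntax differs from the search engine's in general but agrees on the constructs used here. `fullmatch` is used on the assumption that, as in Lucene, the pattern must match a whole term. The candidate list is made up.

```python
import re

# The pattern from the example above, without the enclosing slashes.
PATTERN = re.compile(r"joh?n(ath[oa]n)")


def term_matches(term):
    """True if the whole term matches the pattern."""
    return PATTERN.fullmatch(term) is not None


candidates = ["jonathan", "johnathon", "jonathon", "johnathan",
              "john", "jonathans"]
print([t for t in candidates if term_matches(t)])
# ['jonathan', 'johnathon', 'jonathon', 'johnathan']
```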

For further information, see: Search by Regular Expressions

9. Index and Subfields
Each index used in simple, advanced or expert search has been set up with subindexes, in order to make searching easier and to allow documents to be retrieved despite, for example, typing errors in the query.
Subindexes are fold, rich, stem and raw.
fold: separates the words of the text, converts them to lowercase and translates each character into its ASCII equivalent.
rich: separates the words of the text without further processing.
stem: separates the words of the text and applies stemming to them according to the Porter algorithm.
raw: applies no processing.
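
The processing applied by three of these subindexes can be approximated as follows. This is a rough sketch based only on the descriptions above: the functions are hypothetical, words are split on whitespace only, and stem is omitted because it requires a Porter stemmer implementation.

```python
import unicodedata


def rich(text):
    """Separate the words of the text without further processing."""
    return text.split()


def fold(text):
    """Separate words, lowercase them and reduce characters to ASCII."""
    folded = []
    for word in text.split():
        decomposed = unicodedata.normalize("NFKD", word.lower())
        folded.append(decomposed.encode("ascii", "ignore").decode("ascii"))
    return folded


def raw(text):
    """Keep the text as a single unprocessed value."""
    return [text]


print(fold("Réacteur BWR"))  # ['reacteur', 'bwr']
print(rich("Réacteur BWR"))  # ['Réacteur', 'BWR']
print(raw("Réacteur BWR"))   # ['Réacteur BWR']
```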

Each index includes one or several subindexes. “Standard” searches are carried out on all configured subindexes. In Elasticsearch syntax, the star indicates that all subindexes are queried.
In the expert search, for example, with
ti.\*:
the fold, rich and stem subindexes are queried, because they have been set for the "ti" index.

For further information, see: Index and Subindex List

It is possible to query a particular subindex, provided it has been set for the queried index. In this case the Elasticsearch syntax will be, for example, for a search with the "rich" subindex in the title:
ti.rich : NaCl
Using the "stem" subindex allows documents with nacl, Nacl or NACL in the title to be retrieved.

To use subindexes with a group of words, an exact-phrase search is required, so that the search engine does not interpret the space between words as an “OR”.
ti.fold : "BWR reactor"