Performing a Search with the API
This is a reference on how to perform a full-text search using the Searchcraft API. It is intended for developers who want to build their own custom applications or test search queries programmatically. If you are using the SDK components, building search queries is handled for you. All of the provided examples may be run against your Searchcraft cluster; the `data_test` index is available on all clusters for query testing. Just make sure to replace `read-key-value` with your key and `searchcraft-cluster-url` with your cluster URL.
API Endpoint Reference
POST /index/:index/search
Returns search results data that match the query criteria.
Query Modes
Searchcraft currently accepts two types of search queries.
`fuzzy`
This mode is typo-tolerant but may rank less relevant content higher in the results because it matches on more than the original term tokens. It is a good fit for human-entered queries, since people often misspell words. Fuzzy mode uses Levenshtein distance to match typos within a certain number of edits of the original term.
`exact`
This mode is an excellent fit when you want exact matches only. `exact` should always be used on filters.
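To make the idea of edit distance concrete, here is a minimal sketch of Levenshtein distance in Python. It is illustrative only: it is not Searchcraft's implementation, and the distance threshold that fuzzy mode applies is not specified in this reference.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    # previous[j] is the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters of a into an empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]
```

Under this plain definition a transposed pair of letters counts as two edits, so `levenshtein("wintess", "witness")` is 2.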
Queries will run against the fields marked as searchable defaults in the index schema. You do not need to specify the field name in those cases unless you want to restrict the search to just a single field. You may choose specific fields to search against in your query using the query language syntax.
Order_By, Limit, Offset, and Sort Parameters
The `order_by`, `limit`, `offset`, and `sort` parameters are optional, top-level payload parameters.
- `limit` sets the number of results to return. If you do not set a value, the default is 20.
- `offset` sets the offset of the first result to return. This can be used to paginate through results.
- `order_by` allows you to specify a field to order the results by. This will discard relevance scoring. If you need a combination of relevance and recency, it’s recommended to keep the default order and add a date value or date range query instead.
- `sort` allows you to change the sort order of the results to either ascending or descending. Options are `asc` and `desc`. If you do not set a value, the default is `desc`.
Example request
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"limit":2, "offset": 2, "order_by": "title", "sort": "asc", "query":{"fuzzy":{"ctx":"human"}}}' https://searchcraft-cluster-url/index/data_test/search
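The same request can be assembled programmatically. The sketch below uses only the Python standard library. The helper names (`build_search_payload`, `page_payloads`, `search`) are hypothetical, and it assumes that `offset` counts results rather than pages and that the response body is JSON.

```python
import json
import urllib.request


def build_search_payload(term, mode="fuzzy", limit=20, offset=0,
                         order_by=None, sort=None):
    """Build the JSON body for POST /index/:index/search."""
    if mode not in ("fuzzy", "exact"):
        raise ValueError("mode must be 'fuzzy' or 'exact'")
    payload = {"query": {mode: {"ctx": term}}, "limit": limit, "offset": offset}
    if order_by is not None:
        payload["order_by"] = order_by
    if sort is not None:
        if sort not in ("asc", "desc"):
            raise ValueError("sort must be 'asc' or 'desc'")
        payload["sort"] = sort
    return payload


def page_payloads(term, page_size, pages):
    """Yield one payload per page, advancing offset by page_size results."""
    for page in range(pages):
        yield build_search_payload(term, limit=page_size, offset=page * page_size)


def search(cluster_url, index, read_key, payload):
    """Send the payload to the cluster (assumes a JSON response body)."""
    request = urllib.request.Request(
        f"{cluster_url}/index/{index}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": read_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `build_search_payload("human", limit=2, offset=2, order_by="title", sort="asc")` produces the same body as the curl example above.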
Occur Parameter
The `occur` parameter is optional and defaults to `should` if a value is not provided.
Under the hood, the `should` and `must` settings define the behavior of clauses in Boolean queries, controlling how documents are matched based on the specified conditions. Here’s the difference:
`should` Clause
Purpose: Specifies clauses that are optional but influence the relevance scoring.
Behavior:
- Documents that match `should` clauses are ranked higher in the results.
- If no `must` clause is present, at least one `should` clause must match for a document to be included in the results.
- If `must` clauses are present, `should` clauses act as a boost to relevance scoring without being required for matching.
- Example: Searching for documents where a certain keyword is preferred but not required.
let query = BooleanQuery::new(vec![
    (Occur::Should, term_query_1), // Optional but boosts relevance
    (Occur::Must, term_query_2),   // Required match
]);
`must` Clause
Purpose: Specifies clauses that are mandatory for a document to be included in the results.
Behavior:
- Documents that do not match a must clause are excluded from the results.
- Used to enforce strict conditions that documents must satisfy.
- Example: Searching for documents where a keyword is required.
let query = BooleanQuery::new(vec![
    (Occur::Must, term_query_1),   // Mandatory match
    (Occur::Should, term_query_2), // Optional but boosts relevance
]);
Key Differences
Aspect | should | must |
---|---|---|
Result Criteria Requirement | Optional (but boosts score) | Mandatory |
Matching | At least one should must match if no must clause is defined | All must clauses must match |
Purpose | Adjusts relevance ranking | Enforces strict matching |
When combining a filter with a string query, you typically want to use `must` clauses for both. This functions as a logical `AND`.
Example must request
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query":[{"occur":"must","exact":{"ctx":"category:IN [/science-fiction]"}},{"occur":"must","fuzzy":{"ctx":"world"}}],"limit":20}' https://searchcraft-cluster-url/index/data_test/search | jq
This example uses an array of queries instead of a single query because it combines fuzzy matching on the query term with a filter. Together the two clauses form a logical `AND` query. If both clauses used `occur: "should"` instead, the request would function as a logical `OR` query.
By combining `should` and `must` clauses you can build complex queries that balance precision (with `must`) and recall/relevance (with `should`).
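As a sketch of how such multi-query bodies might be assembled in client code, the Python below builds a request that requires both an exact-mode filter and a fuzzy term match. The helper names `clause` and `filtered_search` are hypothetical. The payload shape follows the `must` request example above.

```python
import json


def clause(mode, ctx, occur="should"):
    """Build one entry of a multi-query request."""
    if mode not in ("fuzzy", "exact"):
        raise ValueError("mode must be 'fuzzy' or 'exact'")
    if occur not in ("should", "must"):
        raise ValueError("occur must be 'should' or 'must'")
    return {"occur": occur, mode: {"ctx": ctx}}


def filtered_search(filter_ctx, term, limit=20):
    """Require both the filter and the fuzzy term to match (logical AND)."""
    return {
        "query": [
            clause("exact", filter_ctx, occur="must"),  # filters use exact mode
            clause("fuzzy", term, occur="must"),        # typo-tolerant term match
        ],
        "limit": limit,
    }


# Serialized body, ready to send as the --data value of a search request.
body = json.dumps(filtered_search("category:/science-fiction", "world"))
```

Switching both clauses to `occur="should"` would turn the request into a logical OR, as described above.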
Using the Searchcraft Query Language
Searchcraft’s query language is a powerful tool for crafting complex search queries. It allows you to specify the fields to search, the operator to use, and the value to search for. It’s inspired by Lucene’s query syntax, but Searchcraft does not use Lucene.
You can combine search queries against multiple fields using AND or OR operators. You can also do exclusion queries using -.
You can also use IN queries. field:IN [foo bar] will match ‘foo’ or ‘bar’, but nothing else. Range queries are possible using the TO operator.
Refer to the specific query type sections below for more details.
IMPORTANT
To use the query language you need to search in `exact` mode. You can combine fuzzy matching with an exact-mode query language query via the API by making a multiple-query request like so:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": [{ "occur": "must", "exact": {"ctx": "active:false"} }, { "fuzzy": {"ctx": "galaxy"} }]}' https://searchcraft-cluster-url/index/data_test/search
Query Writing Guidelines
Escape Special Characters
Some characters need to be escaped in non-quoted terms because they are part of the query language syntax. The special reserved characters are: +, ^, `, :, {, }, ", [, ], (, ), ~, !, \, *, and SPACE. If you want to use any of these characters in a query term, escape it by prefixing it with a backslash \.
Within quoted terms, the quote character in use (' or ") needs to be escaped.
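A small helper can apply these rules before a user-supplied term is placed into a query string. This is a sketch under stated assumptions: the function names are hypothetical, the reserved set is copied from the list above, and it assumes that a quote character inside a quoted term is escaped with a backslash, like the other reserved characters.

```python
# Reserved characters from the list above (the last entry is a space).
RESERVED = set('+^`:{}"[]()~!\\* ')


def escape_term(term: str) -> str:
    """Backslash-escape reserved characters in a non-quoted term."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in term)


def quote_phrase(phrase: str, quote: str = '"') -> str:
    """Wrap a phrase in quotes, escaping the quote character in use.

    Assumption: the quote character is escaped with a backslash.
    """
    if quote not in ('"', "'"):
        raise ValueError("quote must be ' or \"")
    return quote + phrase.replace(quote, "\\" + quote) + quote
```

This escaping is specific to the query language. When the query string is placed in the JSON request body, a JSON encoder such as `json.dumps` applies its own escaping on top.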
Datetime Format
Datetime values must be provided in RFC 3339 format, such as `1970-01-01T00:00:00Z`, or as Unix epoch timestamps, such as `1736367048`.
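Both accepted forms can be produced from a timezone-aware `datetime` using the Python standard library. The function names below are hypothetical, and normalizing to UTC with a trailing `Z` is a choice made for this sketch to match the example format above.

```python
from datetime import datetime, timezone


def to_rfc3339(moment: datetime) -> str:
    """Format an aware datetime as RFC 3339 in UTC, e.g. 1970-01-01T00:00:00Z."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def to_epoch_seconds(moment: datetime) -> int:
    """Return the Unix epoch timestamp in whole seconds."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(moment.timestamp())
```

Rejecting naive datetimes avoids silently interpreting a value in the local timezone of the machine that builds the query.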
`AND` logical operator
An `AND` query will match only if both conditions match.
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "active:false AND rating:>9.0"} }}' https://searchcraft-cluster-url/index/data_test/search
`OR` logical operator
An `OR` query will match if either condition matches.
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "active:false OR rating:>9.0"} }}' https://searchcraft-cluster-url/index/data_test/search
`-` exclusion operator
Using the `-` operator will exclude results that match the term.
{"query": { "exact": {"ctx": "searchterm -excludedterm"} }}
Grouping ()
Parentheses may be used to force the order of evaluation of operators. For instance, if a query should match if ‘field1’ is ‘one’ or ‘two’, and ‘field2’ is ‘three’, you can use (field1:one OR field1:two) AND field2:three.
Operator Precedence
Without parentheses, AND takes precedence over OR. That is, a AND b OR c is interpreted as (a AND b) OR c.
The exclusion operator - takes precedence over everything, such that -a AND b means (-a) AND b, not -(a AND b).
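When query strings are generated in code, it is easy to lose track of precedence. The sketch below uses two hypothetical helpers: `any_of` always parenthesizes a multi-clause OR group, while `all_of` leaves AND chains bare because AND already binds tighter than OR.

```python
def any_of(*clauses: str) -> str:
    """Join clauses with OR, parenthesized so the group survives nesting in AND."""
    if not clauses:
        raise ValueError("at least one clause is required")
    joined = " OR ".join(clauses)
    return f"({joined})" if len(clauses) > 1 else joined


def all_of(*clauses: str) -> str:
    """Join clauses with AND; no parentheses needed since AND binds tighter."""
    if not clauses:
        raise ValueError("at least one clause is required")
    return " AND ".join(clauses)


# Reproduces the grouping example from the Grouping section.
query = all_of(any_of("field1:one", "field1:two"), "field2:three")
```

Here `query` is `(field1:one OR field1:two) AND field2:three`. Without the parentheses the same clauses would be read as `field1:one OR (field1:two AND field2:three)`.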
Field Queries
You are not limited to searching across just the default fields. You can also search against specific field values. This is often useful for filtering results or narrowing down the scope of a search.
Field Term Match
Returns results where a field value matches the provided term.
field:term
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "title:\"cybernetic rebellion\""} }}' https://searchcraft-cluster-url/index/data_test/search
This example uses quotes around the phrase because it wants to match the exact phrase “cybernetic rebellion”. You could also send it without quotes and it would match any documents with the word “cybernetic” or “rebellion” in the title.
Field `IN` Query
Returns results where a field value is one of the provided terms. The term array can be a list of terms or a single term.
field:IN [term1 term2 term3]
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "tags:IN [evolution colonization]"} }}' https://searchcraft-cluster-url/index/data_test/search
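An `IN` clause can be generated from a list of values. In this sketch the helper name `in_clause` is hypothetical. It rejects terms containing whitespace or square brackets instead of guessing how they should be escaped inside the brackets, since that is not specified here.

```python
def in_clause(field: str, terms) -> str:
    """Build a field:IN [term1 term2 ...] clause from an iterable of terms."""
    terms = list(terms)
    if not terms:
        raise ValueError("IN needs at least one term")
    for term in terms:
        # Spaces separate terms and brackets delimit the list, so refuse them.
        if any(ch.isspace() or ch in "[]" for ch in term):
            raise ValueError(f"term needs escaping or quoting first: {term!r}")
    joined = " ".join(terms)
    return f"{field}:IN [{joined}]"
```

For example, `in_clause("tags", ["evolution", "colonization"])` returns the `ctx` value used in the request above.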
Field `TO` Query
Returns results where a value is within a range.
`field:value TO value`
Bounding Range Queries
You can set both inclusive and exclusive bounds when performing a range query. Inclusive bounds are represented by square brackets []. They will match tokens equal to the bound term. Exclusive bounds are represented by curly brackets {}. They will not match tokens equal to the bound term.
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "rating:[6 TO 9]"} }}' https://searchcraft-cluster-url/index/data_test/search
This example will return any documents that have a rating between 6 and 9, including 6 and 9.
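Range clauses are simple to generate once the bound style is chosen. The helper below is a hypothetical sketch. It applies the same bound type to both ends, because mixing an inclusive and an exclusive bracket in a single range is not described in this reference.

```python
def range_clause(field: str, low, high, inclusive: bool = True) -> str:
    """Build a field:[low TO high] (inclusive) or field:{low TO high} (exclusive) clause."""
    opening, closing = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{opening}{low} TO {high}{closing}"


inclusive_rating = range_clause("rating", 6, 9)                   # matches 6 and 9
exclusive_rating = range_clause("rating", 6, 9, inclusive=False)  # excludes 6 and 9
```

The first call reproduces the `rating:[6 TO 9]` clause from the example above.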
Field Value Comparison Queries
You can also use the following operators to compare field values.
Operator | Description |
---|---|
> | Greater than |
< | Less than |
>= | Greater than or equal to |
<= | Less than or equal to |
field:>=value
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "reviews:>=3000"} }}' https://searchcraft-cluster-url/index/data_test/search
This example will return any documents that have 3000 or more reviews.
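Comparison clauses follow the same `field:<operator><value>` pattern for all four operators, so a small validating helper is enough to generate them. The name `comparison_clause` is hypothetical.

```python
COMPARISON_OPERATORS = (">", "<", ">=", "<=")


def comparison_clause(field: str, operator: str, value) -> str:
    """Build a field:<operator><value> clause, e.g. reviews:>=3000."""
    if operator not in COMPARISON_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    return f"{field}:{operator}{value}"


# Clauses can be combined with the logical operators described earlier.
combined = " AND ".join(["active:false", comparison_clause("rating", ">", 9.0)])
```

Here `combined` is `active:false AND rating:>9.0`, the same query string used in the `AND` operator example.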
Match All `*` Query
Matches every document. Does not require a field name. The match all query is only compatible with exact matching via the `exact` query mode. You will likely want to specify a limit. It is rare that you will want to use this query.
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"limit": 100, "query": { "exact": {"ctx": "*"} }}' https://searchcraft-cluster-url/index/data_test/search
Additional Examples Using Query Language Syntax and the `data_test` index
Fuzzy query against default search fields
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query":{"fuzzy":{"ctx":"Wintess"}}}' https://searchcraft-cluster-url/index/data_test/search
Fuzzy query combined with a field filter
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": [{ "occur": "must", "exact": {"ctx": "active:false"} }, { "fuzzy": {"ctx": "galaxy"} }]}' https://searchcraft-cluster-url/index/data_test/search
Exact query against default search fields
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query":{"exact":{"ctx":"Alien"}}}' https://searchcraft-cluster-url/index/data_test/search
Exclusion example
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query":{"exact":{"ctx":"planet -solar"}}}' https://searchcraft-cluster-url/index/data_test/search
By a full category facet
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "category:/science-fiction/cyberpunk"} }}' https://searchcraft-cluster-url/index/data_test/search
By a parent level category facet
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "category:/science-fiction"} }}' https://searchcraft-cluster-url/index/data_test/search
Using limit and offset
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"limit":2, "offset": 2, "query":{"fuzzy":{"ctx":"human"}}}' https://searchcraft-cluster-url/index/data_test/search
Within a group of tags
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "tags:IN [evolution colonization]"} }}' https://searchcraft-cluster-url/index/data_test/search | jq
By a date greater than
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "created_at:>=2020-06-01T00:00:00Z AND created_at:>=2022-01-01T00:00:00Z"} }}' https://searchcraft-cluster-url/index/data_test/search
By a date range
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "created_at:[2021-01-01T00:00:00Z TO 2022-12-31T23:59:59Z]"} }}' https://searchcraft-cluster-url/index/data_test/search
Boolean queries
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "active:false"} }}' https://searchcraft-cluster-url/index/data_test/search
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "active:true"} }}' https://searchcraft-cluster-url/index/data_test/search
Range query with float field
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "rating:[6 TO 9]"} }}' https://searchcraft-cluster-url/index/data_test/search
Greater than or equal query with integer field
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "reviews:>=3000"} }}' https://searchcraft-cluster-url/index/data_test/search
Float field exact value
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "rating:9.4"} }}' https://searchcraft-cluster-url/index/data_test/search
By an ID value
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "id:2"} }}' https://searchcraft-cluster-url/index/data_test/search
Sorting by date in descending order
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "category:/fantasy/epic-fantasy"} }, "sort":"desc", "order_by": "created_at"}' https://searchcraft-cluster-url/index/data_test/search
Tips for better search results
Remember, your application’s search results are only as good as the information you give it. You have full control over the weighting of fields and over which field data gets ingested, so if the results are not to your liking you can always adjust. Not every field displayed in the search result documents needs to be searchable, and you can always add fields, such as keyword fields, that are not displayed but still affect the results.
Notes
If your index uses a stopword dictionary, any stopwords included in the search term will not affect the results. For example, if you use the default `en` stopword dictionary and search for “the” without any other search terms, the results will be empty.