Performing a Search with the API
This is a reference on how to perform a full-text search using the Searchcraft API. It is intended for developers who want to build their own custom applications or test search queries programmatically. If you are using the SDK components, building search queries is handled for you. All of the provided examples may be run against your Searchcraft cluster; the data_test index is available on all clusters for query testing. Just make sure to replace read-key-value with your key and searchcraft-cluster-url with your cluster URL.
API Endpoint Reference
POST /index/:index/search
Returns search result data that matches the query criteria.
POST /federation/:federation_name/search
Returns search result data across all indices defined in a federation that matches the query criteria. The query format is the same as for search by index.
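As a rough illustration of the two endpoints, the sketch below builds (but does not send) a POST request using only the Python standard library. The helper name build_search_request and the federation name my_federation are hypothetical; the headers and URL shape are taken from the curl examples in this reference.

```python
import json
import urllib.request


def build_search_request(cluster_url, read_key, payload, index=None, federation=None):
    """Build (but do not send) a POST request for one of the two search endpoints.

    Hypothetical helper: exactly one of `index` or `federation` must be given.
    """
    if (index is None) == (federation is None):
        raise ValueError("pass exactly one of index or federation")
    if index is not None:
        path = f"/index/{index}/search"
    else:
        path = f"/federation/{federation}/search"
    return urllib.request.Request(
        cluster_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": read_key},
        method="POST",
    )


request = build_search_request(
    "https://searchcraft-cluster-url",  # placeholder, replace with your cluster URL
    "read-key-value",                   # placeholder, replace with your read key
    {"query": {"fuzzy": {"ctx": "human"}}},
    index="data_test",
)
print(request.get_method(), request.full_url)
# To send it against a real cluster: urllib.request.urlopen(request)
```

Passing federation="my_federation" instead of index targets the federation endpoint with the same payload.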
Query Modes
There are currently two types of search queries that Searchcraft accepts.
fuzzy
This mode is typo-tolerant but may rank less relevant content higher in the results because it matches on more than the original term tokens. It is a good fit for queries typed by humans, who often misspell words. Fuzzy mode uses Levenshtein distance to match terms that are within a certain number of edits of the original term.
exact
This mode is an excellent fit when you want exact matches only. exact should always be used on filters.
Queries will run against the fields marked as searchable defaults in the index schema. You do not need to specify the field name in those cases unless you want to restrict the search to just a single field. You may choose specific fields to search against in your query using the query language syntax.
Order_By, Limit, Offset, and Sort Parameters
The order_by, limit, offset, and sort parameters are optional, top-level payload parameters.
limit
Sets the number of results to return. If you do not set a value, the default is 20. It cannot be greater than max_result_limit. For Searchcraft Cloud customers this is set to 200, but if you have a special use case, reach out to support.
offset
Sets the offset of the first result to return. This can be used to paginate through results.
order_by
Allows you to specify a field to order the results by. This will discard relevance scoring. If you need a combination of relevance and recency, it's recommended to instead keep the default order and add a date value or date range query.
sort
Allows you to change the sort order of the results to either ascending or descending. Options are asc and desc. If you do not set a value, the default is desc.
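One way to turn a page number into limit and offset values is sketched below. The function name page_payload and its 1-based page convention are assumptions made for illustration; only the payload keys come from this reference.

```python
import json


def page_payload(query, page, page_size=20, order_by=None, sort=None):
    """Build a search payload for a 1-based page number (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    payload = {"query": query, "limit": page_size, "offset": (page - 1) * page_size}
    if order_by is not None:
        payload["order_by"] = order_by  # note: ordering by a field discards relevance scoring
    if sort is not None:
        if sort not in ("asc", "desc"):
            raise ValueError("sort must be 'asc' or 'desc'")
        payload["sort"] = sort
    return payload


# Second page of two results each, ordered by title ascending.
payload = page_payload({"fuzzy": {"ctx": "human"}}, page=2, page_size=2,
                       order_by="title", sort="asc")
print(json.dumps(payload))
```

The helper does not check page_size against max_result_limit, since that limit depends on the cluster.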
Example request
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"limit":2, "offset": 2, "order_by": "title", "sort": "asc", "query":{"fuzzy":{"ctx":"human"}}}' https://searchcraft-cluster-url/index/data_test/search
Example response
{
  "status": 200,
  "data": {
    "hits": [
      {
        "doc": {
          "title": "The Time Traveler's Last Stand",
          "id": "2",
          "rating": 6.7,
          "tags": ["time travel", "future", "history"],
          "body": "In a future where time travel is outlawed, one man races to prevent a catastrophe from altering the course of history.",
          "reviews": 1973,
          "category": "/science-fiction/time-travel",
          "created_at": "2024-08-03T17:09:24Z",
          "formats": ["/hardback", "/paperback"],
          "active": true
        },
        "document_id": "9581601997514028734",
        "score": 0.46052486,
        "source_index": "data_test"
      },
      {
        "doc": {
          "tags": ["rogue planet", "astronauts", "gravity"],
          "created_at": "2024-07-24T01:17:08Z",
          "category": "/science-fiction/space",
          "reviews": 3912,
          "formats": ["/hardback", "/paperback", "/audio-book", "/e-book"],
          "active": true,
          "body": "When a rogue planet enters the solar system, a team of astronauts must save Earth from being pulled into its gravity well.",
          "rating": 9.3,
          "id": "6",
          "title": "The Gravity Well"
        },
        "document_id": "8716278579726351691",
        "score": 0.46052486,
        "source_index": "data_test"
      }
    ],
    "count": 26,
    "time_taken": 0.002419625,
    "facets": [
      {
        "formats": [
          { "path": "/audio-book", "count": 11 },
          { "path": "/e-book", "count": 17 },
          { "path": "/hardback", "count": 10 },
          { "path": "/paperback", "count": 24 }
        ]
      },
      {
        "category": [
          {
            "path": "/fantasy",
            "count": 7,
            "children": [
              { "path": "/fantasy/dark-fantasy", "count": 1 },
              { "path": "/fantasy/epic-fantasy", "count": 3 },
              { "path": "/fantasy/faerie-tales", "count": 1 },
              { "path": "/fantasy/heroic-fantasy", "count": 1 },
              { "path": "/fantasy/magic", "count": 1 }
            ]
          },
          {
            "path": "/mystery",
            "count": 8,
            "children": [
              { "path": "/mystery/cold-case", "count": 1 },
              { "path": "/mystery/crime", "count": 2 },
              { "path": "/mystery/legal", "count": 1 },
              { "path": "/mystery/psychological", "count": 1 },
              { "path": "/mystery/thriller", "count": 3 }
            ]
          },
          {
            "path": "/science-fiction",
            "count": 11,
            "children": [
              { "path": "/science-fiction/biotechnology", "count": 1 },
              { "path": "/science-fiction/cyberpunk", "count": 2 },
              { "path": "/science-fiction/space", "count": 2 },
              { "path": "/science-fiction/space-colonization", "count": 2 },
              { "path": "/science-fiction/space-mystery", "count": 1 },
              { "path": "/science-fiction/space-opera", "count": 1 },
              { "path": "/science-fiction/time-travel", "count": 2 }
            ]
          }
        ]
      }
    ]
  }
}
The hits array contains the documents within this page of results that matched the query. Page size is determined by the limit parameter (or the default if it is not set).
If you are making a federated search, source_index indicates which index the document originated from. The score value is the relevance score of the document. The facets array is only present on an index that contains facet fields; if none are present it will be omitted. For federated searches, facets are combined from across all matching indices.
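A minimal sketch of reading these response fields in client code follows. The sample is a trimmed copy of the example response above, and summarize is a hypothetical helper name; it reads only top-level facet entries and ignores any children.

```python
import json

# Trimmed copy of the example response above (two hits, one facet group).
SAMPLE = """
{"status": 200, "data": {
  "hits": [
    {"doc": {"title": "The Time Traveler's Last Stand", "id": "2"},
     "document_id": "9581601997514028734", "score": 0.46052486, "source_index": "data_test"},
    {"doc": {"title": "The Gravity Well", "id": "6"},
     "document_id": "8716278579726351691", "score": 0.46052486, "source_index": "data_test"}
  ],
  "count": 26,
  "facets": [{"formats": [{"path": "/audio-book", "count": 11}, {"path": "/e-book", "count": 17}]}]
}}
"""


def summarize(response):
    """Return (hit rows, facet counts) from a decoded search response."""
    data = response["data"]
    rows = [(hit["doc"].get("title"), hit["score"], hit["source_index"])
            for hit in data["hits"]]
    facet_counts = {}
    # "facets" is omitted when the index has no facet fields, so default to [].
    for group in data.get("facets", []):
        for field, entries in group.items():
            facet_counts[field] = {entry["path"]: entry["count"] for entry in entries}
    return rows, facet_counts


rows, facet_counts = summarize(json.loads(SAMPLE))
for title, score, source in rows:
    print(f"{score:.3f}  {source}  {title}")
print(facet_counts)
```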
Occur Parameter
The occur parameter is optional and defaults to should if a value is not provided.
Underneath, the should and must settings define the behavior of clauses in Boolean queries, controlling how documents are matched based on the specified conditions. Here’s the difference:
should Clause
Purpose: Specifies clauses that are optional but influence the relevance scoring.
Behavior:
- Documents that match should clauses are ranked higher in the results.
- If no must clause is present, at least one should clause must match for a document to be included in the results.
- If must clauses are present, should clauses act as a boost to relevance scoring without being required for matching.
Example: Searching for documents where a certain keyword is preferred but not required.
let query = BooleanQuery::new(vec![
    (Occur::Should, term_query_1), // Optional but boosts relevance
    (Occur::Must, term_query_2),   // Required match
]);
must Clause
Purpose: Specifies clauses that are mandatory for a document to be included in the results.
Behavior:
- Documents that do not match a must clause are excluded from the results.
- Used to enforce strict conditions that documents must satisfy.
Example: Searching for documents where a keyword is required.
let query = BooleanQuery::new(vec![
    (Occur::Must, term_query_1),   // Mandatory match
    (Occur::Should, term_query_2), // Optional but boosts relevance
]);
Key Differences
Aspect | should | must |
---|---|---|
Result Criteria Requirement | Optional (but boosts score) | Mandatory |
Matching | At least one should must match if no must clause is defined | All must clauses must match |
Purpose | Adjusts relevance ranking | Enforces strict matching |
When combining a filter with a string query, you typically want to use must clauses for both. This functions as a logical AND.
Example must request
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query":[{"occur":"must","exact":{"ctx":"category: IN [/science-fiction]"}},{"occur":"must","fuzzy":{"ctx":"world"}}],"limit":20}' https://searchcraft-cluster-url/index/data_test/search | jq
This example uses an array of queries instead of a single query because it combines fuzzy matching on the query term with an exact filter. Because both clauses use occur: "must", this forms a logical AND query. If they used occur: "should" instead, it would function as a logical OR query.
By combining should and must clauses you can build complex queries that balance precision (with must) and recall/relevance (with should).
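The multi-clause payload from the must example can also be assembled programmatically. This is a small sketch, not part of any Searchcraft SDK; the clause helper is a hypothetical name, and the payload keys mirror the curl example above.

```python
import json


def clause(mode, ctx, occur=None):
    """Build one query clause; `occur` is left out when None so the API default applies."""
    if mode not in ("fuzzy", "exact"):
        raise ValueError("mode must be 'fuzzy' or 'exact'")
    if occur not in (None, "must", "should"):
        raise ValueError("occur must be 'must' or 'should'")
    body = {mode: {"ctx": ctx}}
    if occur is not None:
        body = {"occur": occur, **body}
    return body


# Exact filter AND fuzzy term: both clauses are required.
payload = {
    "query": [
        clause("exact", "category: IN [/science-fiction]", occur="must"),
        clause("fuzzy", "world", occur="must"),
    ],
    "limit": 20,
}
print(json.dumps(payload))
```

Switching both occur values to "should" would turn the same two clauses into a logical OR.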
Using the Searchcraft Query Language
Searchcraft’s query language is a powerful tool for crafting complex search queries. It allows you to specify the fields to search, the operator to use, and the value to search for. It’s inspired by Lucene’s query syntax, but Searchcraft does not utilize Lucene.
You can combine search queries against multiple fields using AND or OR operators. You can also do exclusion queries using -.
You can also use IN queries. field:IN [foo bar] will match ‘foo’ or ‘bar’, but nothing else. Range queries are possible using the TO operator.
Refer to the specific query type sections below for more details.
IMPORTANT
In order to use the query language you need to search in exact mode. You may combine fuzzy matching with an exact query language query via the API by making a multiple-query request like so:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": [{ "occur": "must", "exact": {"ctx": "active:false"} }, { "fuzzy": {"ctx": "galaxy"} }]}' https://searchcraft-cluster-url/index/data_test/search
Query Writing Guidelines
Escape Special Characters
Some characters need to be escaped in non-quoted terms because they are used as part of the query language syntax. The special reserved characters are:
+, ^, `, :, {, }, ", [, ], (, ), ~, !, \, *, SPACE
If these characters are desired in a query term, they need to be escaped by prefixing them with a backslash \.
Within quoted terms, the quote character in use (' or ") needs to be escaped.
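The escaping rule for non-quoted terms can be sketched as a small helper. The function name escape_term is hypothetical, and the reserved set is transcribed from the list above, so treat it as an assumption rather than an authoritative grammar.

```python
# Reserved characters transcribed from the list above (SPACE included).
RESERVED = set('+^`:{}"[]()~!\\* ')


def escape_term(term):
    """Prefix every reserved character in a non-quoted term with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in term)


print(escape_term("C++ (2nd ed.)"))
```

When the escaped term is placed in a JSON payload, a JSON encoder such as json.dumps will double the backslashes for transport; the server decodes them back to single backslashes.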
Datetime Format
Datetime values must be provided in RFC 3339 format, such as 1970-01-01T00:00:00Z, or as Unix epoch timestamps, such as 1736367048.
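Both accepted datetime forms can be produced from a timezone-aware datetime with the standard library. The helper names below are hypothetical, and normalizing to UTC with a trailing Z is a choice made here to match the example value above.

```python
from datetime import datetime, timezone


def to_rfc3339(dt):
    """Format an aware datetime as an RFC 3339 UTC string with a trailing Z."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def to_epoch(dt):
    """Return whole seconds since the Unix epoch for an aware datetime."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return int(dt.timestamp())


start = datetime(2021, 1, 1, tzinfo=timezone.utc)
print(f"created_at:>={to_rfc3339(start)}")
print(f"created_at:>={to_epoch(start)}")
```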
AND logical operator
An AND query will match only if both conditions match.
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "active:false AND rating:>9.0"} }}' https://searchcraft-cluster-url/index/data_test/search
OR logical operator
An OR query will match if either condition matches.
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "active:false OR rating:>9.0"} }}' https://searchcraft-cluster-url/index/data_test/search
- exclusion operator
Using the - operator will exclude results that match the term.
{"query": { "exact": {"ctx": "searchterm -excludedterm"} }}
Grouping ()
Parentheses may be used to force the order of evaluation of operators. For instance, if a query should match if ‘field1’ is ‘one’ or ‘two’, and ‘field2’ is ‘three’, you can use (field1:one OR field1:two) AND field2:three.
Operator Precedence
Without parentheses, AND takes precedence over OR. That is, a AND b OR c is interpreted as (a AND b) OR c.
The exclusion operator - takes precedence over everything, such that -a AND b means (-a) AND b, not -(a AND b).
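When query strings are assembled in code, wrapping every combined group in parentheses avoids relying on precedence at all. The helpers any_of and all_of below are hypothetical names, and the sketch assumes the parser accepts redundant outer parentheses.

```python
def _join(operator, clauses):
    """Join clauses with an operator, parenthesizing groups of two or more."""
    if not clauses:
        raise ValueError("at least one clause is required")
    if len(clauses) == 1:
        return clauses[0]
    return "(" + f" {operator} ".join(clauses) + ")"


def any_of(*clauses):
    """Match if any clause matches (OR)."""
    return _join("OR", clauses)


def all_of(*clauses):
    """Match only if every clause matches (AND)."""
    return _join("AND", clauses)


ctx = all_of(any_of("field1:one", "field1:two"), "field2:three")
print(ctx)  # ((field1:one OR field1:two) AND field2:three)
```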
Field Queries
You are not limited to searching across just the default fields. You can also search against specific field values. This is often useful for filtering results or narrowing down the scope of a search.
Field Term Match
Returns results where a field value matches the provided term.
field:term
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "title:\"cybernetic rebellion\""} }}' https://searchcraft-cluster-url/index/data_test/search
The phrase is quoted in this example so that it matches the exact phrase “cybernetic rebellion”. You could also send it without quotes, and it would match any documents with the word “cybernetic” or “rebellion” in the title.
Field IN Query
Returns results where a field value is one of the provided terms. The term array can be a list of terms or a single term.
field:IN [term1 term2 term3]
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "tags:IN [evolution colonization]"} }}' https://searchcraft-cluster-url/index/data_test/search
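An IN clause can be generated from a list of terms as sketched below. The helper name in_clause is hypothetical, and it assumes the terms contain no reserved characters; terms that do would need escaping as described in the query writing guidelines.

```python
def in_clause(field, terms):
    """Build a field:IN [...] clause from one or more plain terms."""
    terms = list(terms)
    if not terms:
        raise ValueError("at least one term is required")
    return f"{field}:IN [{' '.join(terms)}]"


payload = {"query": {"exact": {"ctx": in_clause("tags", ["evolution", "colonization"])}}}
print(payload["query"]["exact"]["ctx"])  # tags:IN [evolution colonization]
```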
Field TO Query
Returns results where a value is within a range.
field:value TO value
Bounding Range Queries
You can set both inclusive and exclusive bounds when performing a range query. Inclusive bounds are represented by square brackets []. They will match tokens equal to the bound term. Exclusive bounds are represented by curly brackets {}. They will not match tokens equal to the bound term.
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "rating:[6 TO 9]"} }}' https://searchcraft-cluster-url/index/data_test/search
This example will return any documents that have a rating between 6 and 9, including 6 and 9.
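The two bound styles can be generated with a single flag, as in this sketch. The helper name range_clause is hypothetical, and it deliberately applies the same bound type to both ends, since mixing bracket types is not covered in this reference.

```python
def range_clause(field, low, high, inclusive=True):
    """Build a range clause with inclusive [] or exclusive {} bounds on both ends."""
    opening, closing = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{opening}{low} TO {high}{closing}"


print(range_clause("rating", 6, 9))                   # rating:[6 TO 9]
print(range_clause("rating", 6, 9, inclusive=False))  # rating:{6 TO 9}
```

The exclusive form leaves out documents whose rating is exactly 6 or exactly 9.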
Field Value Comparison Queries
You can also use the following operators to compare field values.
Operator | Description |
---|---|
> | Greater than |
< | Less than |
>= | Greater than or equal to |
<= | Less than or equal to |
field:>=value
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "reviews:>=3000"} }}' https://searchcraft-cluster-url/index/data_test/search
This example will return any documents that have 3000 or more reviews.
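A comparison clause can be built with a simple check that the operator is one of the four listed in the table. The helper name compare_clause is hypothetical.

```python
COMPARISON_OPERATORS = (">", "<", ">=", "<=")


def compare_clause(field, operator, value):
    """Build a field comparison clause such as reviews:>=3000."""
    if operator not in COMPARISON_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    return f"{field}:{operator}{value}"


print(compare_clause("reviews", ">=", 3000))  # reviews:>=3000
print(compare_clause("rating", ">", 9.0))     # rating:>9.0
```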
Match All * Query
Matches every document. Does not require a field name. The match all query is only compatible with the exact query mode. You will likely want to specify a limit. It is rare that you will want to use this query.
Example:
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"limit": 100, "query": { "exact": {"ctx": "*"} }}' https://searchcraft-cluster-url/index/data_test/search
Additional Examples Using Query Language Syntax and the data_test index
Fuzzy query against default search fields
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query":{"fuzzy":{"ctx":"Wintess"}}}' https://searchcraft-cluster-url/index/data_test/search
Fuzzy query combined with a field filter
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": [{ "occur": "must", "exact": {"ctx": "active:false"} }, { "fuzzy": {"ctx": "galaxy"} }]}' https://searchcraft-cluster-url/index/data_test/search
exact query against default search fields
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query":{"exact":{"ctx":"Alien"}}}' https://searchcraft-cluster-url/index/data_test/search
Exclusion example
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query":{"exact":{"ctx":"planet -solar"}}}' https://searchcraft-cluster-url/index/data_test/search
By a full category facet
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "category:/science-fiction/cyberpunk"} }}' https://searchcraft-cluster-url/index/data_test/search
By a parent level category facet
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "category:/science-fiction"} }}' https://searchcraft-cluster-url/index/data_test/search
Using limit and offset
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"limit":2, "offset": 2, "query":{"fuzzy":{"ctx":"human"}}}' https://searchcraft-cluster-url/index/data_test/search
Within a group of tags
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "tags:IN [evolution colonization]"} }}' https://searchcraft-cluster-url/index/data_test/search | jq
By a date greater than
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "created_at:>=2020-06-01T00:00:00Z AND created_at:>=2022-01-01T00:00:00Z"} }}' https://searchcraft-cluster-url/index/data_test/search
By a date range
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "created_at:[2021-01-01T00:00:00Z TO 2022-12-31T23:59:59Z]"} }}' https://searchcraft-cluster-url/index/data_test/search
Boolean queries
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "active:false"} }}' https://searchcraft-cluster-url/index/data_test/search
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "active:true"} }}' https://searchcraft-cluster-url/index/data_test/search
Range query with float field
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "rating:[6 TO 9]"} }}' https://searchcraft-cluster-url/index/data_test/search
Greater than or equal query with integer field
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "reviews:>=3000"} }}' https://searchcraft-cluster-url/index/data_test/search
Float field exact value
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "rating:9.4"} }}' https://searchcraft-cluster-url/index/data_test/search
By an ID value
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "id:2"} }}' https://searchcraft-cluster-url/index/data_test/search
Sorting by date in descending order
curl -X POST -H "Content-Type: application/json" -H "Authorization: read-key-value" --data '{"query": { "exact": {"ctx": "category:/fantasy/epic-fantasy"} }, "sort":"desc", "order_by": "created_at"}' https://searchcraft-cluster-url/index/data_test/search
Tips for better search results
Remember, your application’s search results are only as good as the information that you give it. You have full control over the weighting of fields and over what field data gets ingested, so if you find the results are not to your liking you can always adjust. Not every field that is displayed in the search result documents needs to be searchable, and you can always add fields, such as keyword fields, that you do not display but that affect the results.
If your index uses a stopword dictionary, any stopwords included in the search term will not affect the results. For example, if you use the default en stopword dictionary and search for “the” without any other search terms, the results will be empty.
Federated Search Considerations
When using a fuzzy search, the request will search across the default search fields for each index. If you make an exact search using the query language syntax and specify a field or facet name explicitly, that field must exist in all of the queried indices across a federation. If it does not, the source index missing the field will be excluded from the results.
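To see which indices actually contributed to a federated result, the hits can be grouped by source_index. This is an illustrative sketch: the helper name hits_by_source and the index names books and films are hypothetical, and the response shape follows the example response earlier in this reference.

```python
from collections import defaultdict


def hits_by_source(response):
    """Group hit documents from a decoded search response by their source index."""
    grouped = defaultdict(list)
    for hit in response["data"]["hits"]:
        grouped[hit["source_index"]].append(hit["doc"])
    return dict(grouped)


# Hypothetical federated response spanning two indices.
response = {
    "status": 200,
    "data": {
        "hits": [
            {"doc": {"title": "The Gravity Well"}, "score": 0.46, "source_index": "books"},
            {"doc": {"title": "Gravity Falls Short"}, "score": 0.41, "source_index": "films"},
            {"doc": {"title": "Rogue Planet"}, "score": 0.38, "source_index": "books"},
        ],
        "count": 3,
    },
}

grouped = hits_by_source(response)
for index_name, docs in grouped.items():
    print(index_name, [doc["title"] for doc in docs])
```

An index that is absent from the grouping either had no matches on this page or, for an exact query naming a field, may have been excluded because it lacks that field.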