Neo4jQueryReader
QueryResult
Bases: TypedDict
Query execution outputs for the Neo4jQueryReader component.
Source code in src/neo4j_haystack/components/neo4j_query_reader.py
Neo4jQueryReader
A component for reading arbitrary data from a Neo4j database using a plain Cypher query.
This component offers a flexible way to read data from Neo4j by running a custom Cypher query along with query parameters. Query parameters can be supplied in a pipeline from other components (or pipeline inputs). You can use such queries to read data from Neo4j to enhance your RAG pipelines. For example, a prompt to an LLM can produce a Cypher query based on the given context, and Neo4jQueryReader can then be used to run the query and extract results. The OutputAdapter component might come in handy in such scenarios: it can be connected to the output of Neo4jQueryReader to convert (transform) the results accordingly.
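As a rough illustration of the kind of transformation mentioned above, the sketch below turns a result dictionary into plain text that could be placed into a prompt. It uses only the standard library; the function name records_to_context and the sample data are hypothetical, and in a real pipeline this logic would more likely live in an OutputAdapter template or a custom filter.

```python
from typing import Any, Dict, List


def records_to_context(result: Dict[str, Any]) -> str:
    """Render each record as one 'key: value, ...' line (hypothetical helper)."""
    records: List[Dict[str, Any]] = result.get("records", [])
    lines = []
    for record in records:
        lines.append(", ".join(f"{key}: {value}" for key, value in record.items()))
    return "\n".join(lines)


# Sample data shaped like the component's "records" output
sample = {"records": [{"name": "name_0", "year": 2020}, {"name": "name_1", "year": 2020}]}
context = records_to_context(sample)
print(context)
```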
Note
Pay attention to data type mappings in the Cypher query when working with query parameters. The Neo4j Python Driver handles type conversions/mappings; in particular, the driver documentation describes how to work with temporal types.
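For instance, standard-library temporal values can be placed in the parameters dictionary directly. The sketch below only builds such a dictionary; the property name created and the query are hypothetical, and the expectation that the driver maps datetime.date to a Cypher Date should be checked against the driver documentation for the version you use.

```python
import datetime

# Hypothetical query comparing a date property against a parameter
query = "MATCH (doc:`Document`) WHERE doc.created >= $since RETURN doc.name AS name"

# A plain Python date; the driver is expected to map it to a Cypher Date
parameters = {"since": datetime.date(2020, 1, 1)}

# With a configured reader this would be passed as:
#   reader.run(query=query, parameters=parameters)
print(parameters["since"].isoformat())
```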
from neo4j_haystack.client.neo4j_client import Neo4jClientConfig
from neo4j_haystack.components.neo4j_query_reader import Neo4jQueryReader

client_config = Neo4jClientConfig("bolt://localhost:7687", database="neo4j", username="neo4j", password="passw0rd")
reader = Neo4jQueryReader(client_config=client_config, runtime_parameters=["year"])

# Get all documents with "year"=2020 and return "name" and "embedding" attributes for each found record
result = reader.run(
    query=("MATCH (doc:`Document`) WHERE doc.year=$year RETURN doc.name as name, doc.embedding as embedding"),
    year=2020,
)
Output
>>> {'records': [{'name': 'name_0', 'embedding': [...]}, {'name': 'name_1', 'embedding': [...]}, {'name': 'name_2', 'embedding': [...]}], 'first_record': {'name': 'name_0', 'embedding': [...]}}
The above result contains the following outputs:
- records - A list of dictionaries containing all the records returned by the Cypher query. You can shape the record output to your needs. For example, an aggregation function could be used to return a single result, in which case there will be one record in the records list.
- first_record - In case records contains just one item, first_record will have the first record from the list (put simply, first_record=records[0]). It was introduced as a syntactic convenience.
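The relationship between the two outputs can be sketched in a few lines of plain Python. This is an illustration rather than the component's actual code; build_outputs is a hypothetical helper, and returning None for an empty list is an assumption.

```python
from typing import Any, Dict, List, Optional


def build_outputs(records: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Package records the way the outputs above are described (hypothetical)."""
    # Assumption: first_record is None when the query returned nothing
    first_record: Optional[Dict[str, Any]] = records[0] if records else None
    return {"records": records, "first_record": first_record}


# An aggregation query such as "RETURN count(doc) AS total" yields one record
outputs = build_outputs([{"total": 3}])
total = outputs["first_record"]["total"]  # shorter than outputs["records"][0]["total"]
```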
If your Cypher query produces an error (e.g. invalid syntax), you can use that error in Loop-Based Auto-Correction pipelines: ask the LLM to correct the query based on the error message, then run the query again.
from neo4j_haystack.client.neo4j_client import Neo4jClientConfig
from neo4j_haystack.components.neo4j_query_reader import Neo4jQueryReader

client_config = Neo4jClientConfig("bolt://localhost:7687", database="neo4j", username="neo4j", password="passw0rd")
reader = Neo4jQueryReader(client_config=client_config, raise_on_failure=False)

# Intentionally introduce error in Cypher query (see "RETURN_")
result = reader.run(
    query=("MATCH (doc:`Document` {name: $name}) RETURN_ doc.name as name, doc.year as year"),
    parameters={"name": "name_1"},
)
Output
>>> {'error_message': "Invalid input 'RETURN_'...", 'error': <Exception>}
The error_message output can be used in your pipeline to deal with a Cypher query error (e.g. auto-correction).
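A loop of this kind can be sketched without a database or an LLM by stubbing both out. In the sketch below, run_query stands in for reader.run(...) with raise_on_failure=False, and fix_query stands in for an LLM call; both names, and the max_attempts limit, are hypothetical.

```python
from typing import Any, Dict


def run_query(query: str) -> Dict[str, Any]:
    """Stub for reader.run: fails on the misspelled keyword, succeeds otherwise."""
    if "RETURN_" in query:
        return {"error_message": "Invalid input 'RETURN_'", "error": ValueError("syntax")}
    return {"records": [{"name": "name_1"}], "first_record": {"name": "name_1"}}


def fix_query(query: str, error_message: str) -> str:
    """Stub for an LLM that corrects the query based on the error message."""
    return query.replace("RETURN_", "RETURN")


def run_with_correction(query: str, max_attempts: int = 3) -> Dict[str, Any]:
    """Run the query, asking for a corrected version after each failure."""
    result: Dict[str, Any] = {}
    for _ in range(max_attempts):
        result = run_query(query)
        if "error_message" not in result:
            return result  # success, stop looping
        query = fix_query(query, result["error_message"])
    return result  # still failing after max_attempts; the caller decides what to do


result = run_with_correction("MATCH (doc:`Document`) RETURN_ doc.name AS name")
```

Bounding the number of attempts keeps a pipeline from looping forever when the correction step never produces a valid query.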
When configuring query parameters for the Neo4jQueryReader component, consider the following:
- Parameters can be provided at component creation time, see parameters.
- In a RAG pipeline, runtime parameters can be connected from other components. Make sure to specify at creation time which runtime_parameters are expected.
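How explicitly supplied and runtime values end up in one parameter set can be pictured as a dictionary merge. The sketch below is an assumption about the general idea, not the component's implementation: combine_parameters is a hypothetical helper, and both the check for undeclared names and the precedence (explicit parameters override runtime values) are chosen only for illustration.

```python
from typing import Any, Dict, List, Optional


def combine_parameters(
    runtime_parameters: List[str],
    parameters: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Merge declared runtime inputs with explicit parameters (hypothetical)."""
    unexpected = set(kwargs) - set(runtime_parameters)
    if unexpected:
        raise ValueError(f"Undeclared runtime parameters: {sorted(unexpected)}")
    # Assumed precedence: explicit parameters win over runtime values
    return {**kwargs, **(parameters or {})}


combined = combine_parameters(["year"], parameters={"name": "name_1"}, year=2020)
```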
Important
At the moment parameters support simple data types, dictionaries and Python dataclasses (which can be converted to dict). For example, a haystack.ChatMessage instance is a valid query parameter input. If you supply custom classes as query parameters, e.g. Neo4jQueryReader(client_config=client_config).run(parameters={"obj": <instance of custom class>}), it will result in an error. In such rare cases the query_parameters_marshaller attribute can be used to provide a custom marshaller implementation for the type being used as a query parameter value.
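The idea behind a marshaller can be sketched with the standard library alone. The interface shape below (supports and marshal methods) is an assumption made for illustration and may not match QueryParametersMarshaller exactly; Point, Author, PointMarshaller and to_query_value are hypothetical names.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any


class Point:
    """A custom class that is neither a simple type nor a dataclass."""

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y


@dataclass
class Author:
    name: str


class PointMarshaller:
    """Hypothetical marshaller; the assumed interface is supports() + marshal()."""

    def supports(self, obj: Any) -> bool:
        return isinstance(obj, Point)

    def marshal(self, obj: Any) -> Any:
        return {"x": obj.x, "y": obj.y}


def to_query_value(obj: Any, marshaller: PointMarshaller) -> Any:
    """Convert a parameter value into something a driver could accept."""
    if marshaller.supports(obj):
        return marshaller.marshal(obj)
    if is_dataclass(obj) and not isinstance(obj, type):
        return asdict(obj)  # dataclass instances become plain dictionaries
    return obj  # simple types pass through unchanged


marshaller = PointMarshaller()
raw = {"obj": Point(1.0, 2.0), "author": Author("Ann"), "year": 2020}
params = {key: to_query_value(value, marshaller) for key, value in raw.items()}
```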
__init__
__init__(
    client_config: Neo4jClientConfig,
    query: Optional[str] = None,
    runtime_parameters: Optional[List[str]] = None,
    verify_connectivity: Optional[bool] = False,
    raise_on_failure: bool = False,
    query_parameters_marshaller: Optional[QueryParametersMarshaller] = None,
)
Parameters:
- client_config (Neo4jClientConfig) – Neo4j client configuration to connect to the database (e.g. credentials and connection settings).
- query (Optional[str], default: None) – Optional Cypher query if known at component creation time. If None, it should be provided as component input.
- runtime_parameters (Optional[List[str]], default: None) – List of input parameters/slots for connecting components in a pipeline.
- verify_connectivity (Optional[bool], default: False) – If True, will verify connectivity with the Neo4j database configured by client_config.
- raise_on_failure (bool, default: False) – If True, raises an exception if it fails to execute the given Cypher query.
- query_parameters_marshaller (Optional[QueryParametersMarshaller], default: None) – Marshaller responsible for converting query parameters so they can be used in a Cypher query, e.g. Python dataclasses to be converted to a dictionary. Neo4jQueryParametersMarshaller is the default marshaller implementation.
to_dict
Serialize this component to a dictionary.
from_dict (classmethod)
Deserialize this component from a dictionary.
run
run(
    query: Optional[str] = None,
    parameters: Optional[Dict[str, Any]] = None,
    **kwargs
) -> QueryResult
Runs an arbitrary Cypher query with parameters to read data from Neo4j.
Parameters:
- query (Optional[str], default: None) – Cypher query to run.
- parameters (Optional[Dict[str, Any]], default: None) – Cypher query parameters which can be used as placeholders in the query.
- kwargs – Arbitrary parameters supplied in a pipeline execution from other components' output slots, e.g. pipeline.connect("year_provider.year_start", "reader.year_start"), where year_start will be part of kwargs.
Returns:
- Output (QueryResult) – Records returned from the Cypher query if the request was successful, or an error message if there was an error during Cypher query execution (raise_on_failure should be False).
On success, the output contains:
- records - List of records returned (e.g. using a RETURN statement) by the Cypher query
- first_record - First record from the records list, if any
On failure, the output contains:
- error_message - Error message returned by Neo4j in case the Cypher query is invalid
- error - Original Exception which was triggered by Neo4j (containing the error_message)
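Downstream code usually needs to tell the success and failure shapes apart. The sketch below models them with a TypedDict whose keys are all optional; the class name QueryResultSketch and the helper describe are hypothetical and only mirror the output keys documented for run.

```python
from typing import Any, Dict, List, Optional, TypedDict


class QueryResultSketch(TypedDict, total=False):
    """Hypothetical stand-in mirroring the documented output keys."""

    records: List[Dict[str, Any]]
    first_record: Optional[Dict[str, Any]]
    error_message: str
    error: Exception


def describe(result: QueryResultSketch) -> str:
    """Return a short summary for either the success or the failure shape."""
    if "error_message" in result:
        return f"failed: {result['error_message']}"
    return f"ok: {len(result.get('records', []))} record(s)"


ok = describe({"records": [{"name": "name_0"}], "first_record": {"name": "name_0"}})
failed = describe({"error_message": "Invalid input 'RETURN_'", "error": ValueError("bad")})
```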
_serialize_parameters
Serializes parameters into a data structure which can be accepted by the Neo4j Python Driver (and a Cypher query respectively). See Neo4jQueryParametersMarshaller for more details.