nti.zope_catalog¶
Utilities and extensions for ZODB-based Zope catalogs and indexes.
This builds on both zope.catalog and zc.catalog.
Reference¶
Interfaces¶
Interfaces related to catalogs.
-
interface
nti.zope_catalog.interfaces.
INoAutoIndexEver
[source]¶ Extends:
zope.catalog.interfaces.INoAutoIndex
,zope.catalog.interfaces.INoAutoReindex
Marker interface for objects that should not automatically be added to catalogs when created or modified events fire.
-
interface
nti.zope_catalog.interfaces.
IDeferredCatalog
[source]¶ Extends:
zope.catalog.interfaces.ICatalogQuery
,zope.catalog.interfaces.ICatalogEdit
,zope.container.interfaces.IContainer
Just like ICatalog, but a distinct interface to be able to distinguish it at runtime, typically in event subscribers.
The use-case is for certain catalogs that want to defer indexing (sometimes to a separate process). Implement this interface instead of ICatalog so that the subscribers zope.catalog.catalog.indexDocSubscriber(), zope.catalog.catalog.unindexDocSubscriber(), and zope.catalog.catalog.reindexDocSubscriber() do not find and use this object.
To search, you’ll want to look for ICatalogQuery, which is implemented by both ICatalog and this object.
To find every utility that can support indexing, you can look for zope.index.interfaces.IInjection, which is also implemented by both interfaces.
As a base, instead of extending Catalog, you can extend DeferredCatalog.
New in version 2.0.0.
-
__setitem__
(key, value)¶
-
-
nti.zope_catalog.interfaces.
IMetadataCatalog
¶ Backwards compatibility alias.
alias of
nti.zope_catalog.interfaces.IDeferredCatalog
-
nti.zope_catalog.interfaces.
__setitem__
(key, value)¶
-
Catalog¶
Catalog extensions.
-
class
nti.zope_catalog.catalog.
ResultSet
(uids, uidutil, ignore_invalid=False)[source]¶ Bases:
object
Lazily accessed set of objects.
This is just like
zope.catalog.catalog.ResultSet
except it is slower (it has more overhead) and it offers the dubious feature of ignoring broken objects (which is a footgun if ever there was). If you have such objects, your code or deployment is broken.
Prefer not to use this class in normal operations (it might be useful for recovery, but even that’s doubtful since it doesn’t track which objects were “invalid”).
-
class
nti.zope_catalog.catalog.
CatalogPrefetchIterator
(iterable, chunk_size)[source]¶ Bases:
object
Given an iterator of (intid, object):
- Breaks the iterator into chunks of a given size;
- Detects any persistent objects in the chunk connected to a jar (this is done by checking for a _p_jar);
- Groups those objects by jar if needed (supporting multiple databases, because ZODB 5 currently does not correctly do this);
- Asks each jar to prefetch the given objects;
- Finally, iterates over the chunk.
This object is intended to be used with the
_visitSublocations
method of a catalog, but may be useful in other cases.
For example, one could enhance a standard zope.catalog.catalog.ResultSet like so (note this won’t work for the ResultSet defined here):

    from zope.catalog.catalog import ResultSet
    from nti.zope_catalog.catalog import CatalogPrefetchIterator

    class PrefetchedResultSet(ResultSet, object):
        def __iter__(self):
            # Pair each intid with its object; the inner parentheses are
            # required to build a tuple inside a generator expression.
            iterable = ((uid, self.uidutil.getObject(uid)) for uid in self.uids)
            for _, obj in CatalogPrefetchIterator(iterable, 512):
                yield obj

New in version 3.0.
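The chunk, group and prefetch steps listed above can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: `prefetching_iter`, `FakeJar` and `FakeObj` are hypothetical names, and the fake jar merely records calls where a real ZODB connection would load object state.

```python
from collections import defaultdict
from itertools import islice


def prefetching_iter(pairs, chunk_size):
    """Yield (intid, object) pairs, prefetching each chunk by jar first."""
    pairs = iter(pairs)
    while True:
        chunk = list(islice(pairs, chunk_size))
        if not chunk:
            return
        # Group the objects that are connected to a jar, keyed by that jar.
        by_jar = defaultdict(list)
        for _, obj in chunk:
            jar = getattr(obj, '_p_jar', None)
            if jar is not None:
                by_jar[jar].append(obj)
        # Ask each jar to prefetch its own objects in one request.
        for jar, objs in by_jar.items():
            jar.prefetch(objs)
        for pair in chunk:
            yield pair


class FakeJar(object):
    """Hypothetical stand-in for a connection; records prefetch sizes."""

    def __init__(self):
        self.calls = []

    def prefetch(self, objs):
        self.calls.append(len(objs))


class FakeObj(object):
    """Hypothetical stand-in for a persistent object attached to a jar."""

    def __init__(self, jar):
        self._p_jar = jar


jar = FakeJar()
pairs = [(i, FakeObj(jar)) for i in range(5)]
intids = [intid for intid, _ in prefetching_iter(pairs, 2)]
```

With a chunk size of 2 and five objects on one jar, the jar is asked to prefetch three times (for 2, 2 and 1 objects) rather than once per object.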
-
class
nti.zope_catalog.catalog.
Catalog
(family=None)[source]¶ Bases:
zope.catalog.catalog.Catalog
An extended catalog. Features include:
- When manually calling updateIndex() or updateIndexes(), objects that provide nti.zope_catalog.interfaces.INoAutoIndex are ignored. Note that if you have previously indexed objects that now provide this (i.e., class definition has changed) you need to clear() the catalog first for this to be effective.
- Updating indexes can optionally ignore certain errors related to persistence POSKeyErrors. Note that updating a single index does this by default (since it is usually called from the IObjectAdded event handler) but updating all indexes does not since it is usually called by hand.
-
class
nti.zope_catalog.catalog.
DeferredCatalog
(family=None)[source]¶ Bases:
nti.zope_catalog.catalog.Catalog
An implementation of
nti.zope_catalog.interfaces.IDeferredCatalog
.
Normalization¶
There are several helpers for indexes that normalize values.
Mixin base classes.
-
class
nti.zope_catalog.mixin.
AbstractNormalizerMixin
[source]¶ Bases:
object
Base class for normalizing values. All methods are directed to
value()
by default; for more specific behaviour, override the corresponding method.
Dates¶
Support efficiently storing datetime values in an index, normalized.
-
class
nti.zope_catalog.datetime.
TimestampNormalizer
(resolution=2)[source]¶ Bases:
persistent.Persistent
,nti.zope_catalog.mixin.AbstractNormalizerMixin
Normalizes incoming Unix timestamps (or datetimes) to have a set resolution, by default minutes.
-
RES_DAY
= 0¶ Constant for normalizing to days.
-
RES_HOUR
= 1¶ Constant for normalizing to hours.
-
RES_MINUTE
= 2¶ Constant for normalizing to minutes.
-
RES_SECOND
= 3¶ Constant for normalizing to seconds.
-
RES_MICROSECOND
= 4¶ Constant for normalizing to microseconds.
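To make the resolution constants concrete, here is a minimal sketch of what normalizing a Unix timestamp to a resolution can mean. It assumes truncation in UTC; the library's exact rounding and time zone handling are not described in this reference and may differ. `truncate_timestamp` is a hypothetical helper, and the `RES_*` values simply mirror the constants documented above.

```python
from datetime import datetime, timezone

# Values mirror the documented class constants.
RES_DAY, RES_HOUR, RES_MINUTE, RES_SECOND, RES_MICROSECOND = range(5)

# Fields to zero out for each resolution (an assumption: truncate, not round).
_ZEROED_FIELDS = {
    RES_DAY: dict(hour=0, minute=0, second=0, microsecond=0),
    RES_HOUR: dict(minute=0, second=0, microsecond=0),
    RES_MINUTE: dict(second=0, microsecond=0),
    RES_SECOND: dict(microsecond=0),
    RES_MICROSECOND: {},
}


def truncate_timestamp(timestamp, resolution=RES_MINUTE):
    """Return the timestamp truncated to the given resolution, in UTC."""
    dt = datetime.fromtimestamp(timestamp, timezone.utc)
    dt = dt.replace(**_ZEROED_FIELDS[resolution])
    return dt.timestamp()


minute_value = truncate_timestamp(90.5)            # default: minutes
second_value = truncate_timestamp(90.5, RES_SECOND)
```

Coarser resolutions map many nearby timestamps to the same value, which keeps the number of distinct keys in an index small.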
-
-
class
nti.zope_catalog.datetime.
TimestampTo64BitIntNormalizer
(resolution=2)[source]¶ Bases:
nti.zope_catalog.datetime.TimestampNormalizer
Normalizes incoming Unix timestamps to have a set resolution, by default minutes, and then converts them to integers that can be stored in an IntegerAttributeIndex.
Changed in version 2.0.0: Now subclass TimestampNormalizer and rework some internal attributes.
Changed in version 2.0.0: Rename from TimestampToNormalized64BitIntNormalizer to TimestampTo64BitIntNormalizer. Previously, FloatTo64BitIntNormalizer was imported under this name.
-
nti.zope_catalog.datetime.
TimestampToNormalized64BitIntNormalizer
¶ Backwards compatibility alias.
alias of
nti.zope_catalog.datetime.TimestampTo64BitIntNormalizer
Numbers¶
Normalization of numbers.
-
class
nti.zope_catalog.number.
FloatTo64BitIntNormalizer
[source]¶ Bases:
nti.zope_catalog.mixin.AbstractNormalizerMixin
Normalizes incoming floating point objects to 64-bit integers.
Use this with a zc.catalog.catalogindex.NormalizationWrapper. Note that when you do so, the values returned by a method like zc.catalog.interfaces.IIndexValues() will be integer representations, not floating point timestamps.
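As a rough illustration of why index values come back as integers, the sketch below shows one possible way to map a float onto a 64-bit integer by reinterpreting its IEEE 754 bits. This is an assumption made for illustration; the encoding the library actually uses is not shown in this reference. `float_to_int64` and `int64_to_float` are hypothetical helpers.

```python
import struct


def float_to_int64(value):
    """Reinterpret the 8 bytes of a double as a signed 64-bit integer."""
    return struct.unpack('!q', struct.pack('!d', value))[0]


def int64_to_float(value):
    """Inverse of float_to_int64."""
    return struct.unpack('!d', struct.pack('!q', value))[0]


earlier = float_to_int64(1500000000.25)
later = float_to_int64(1500000000.75)
round_tripped = int64_to_float(earlier)
```

For non-negative values such as timestamps, this particular encoding keeps the integers in the same order as the floats, so range queries still behave as expected. Negative floats would not sort correctly under this simple scheme.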
-
class
nti.zope_catalog.number.
PersistentFloatTo64BitIntNormalizer
[source]¶ Bases:
persistent.Persistent
,nti.zope_catalog.number.FloatTo64BitIntNormalizer
Persistent normalizer that can be stored in an nti.zope_catalog.index.IntegerAttributeIndex.
Changed in version 2.0.0: Now subclasses FloatTo64BitIntNormalizer.
Changed in version 2.0.0: Rename from FloatToNormalized64BitIntNormalizer to PersistentFloatTo64BitIntNormalizer.
-
nti.zope_catalog.number.
FloatToNormalized64BitIntNormalizer
¶ Backwards compatibility alias.
alias of
nti.zope_catalog.number.PersistentFloatTo64BitIntNormalizer
Strings¶
Helpers for indexing strings.
-
class
nti.zope_catalog.string.
StringTokenNormalizer
[source]¶ Bases:
nti.zope_catalog.mixin.AbstractNormalizerMixin
A normalizer for strings that are treated like tokens: strings are lower-cased and guaranteed to be unicode and leading and trailing spaces are removed.
This object accepts byte strings and decodes them using UTF-8.
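The behaviour described above amounts to roughly the following, shown as a plain function rather than the class itself. `normalize_token` is a hypothetical name used only for this sketch.

```python
def normalize_token(value):
    """Decode bytes as UTF-8, lower-case, and strip surrounding whitespace."""
    if isinstance(value, bytes):
        value = value.decode('utf-8')
    return value.lower().strip()


from_bytes = normalize_token(b'  Hello ')
from_text = normalize_token(u' MiXed Case')
```

Because both indexed values and query values pass through the same normalization, lookups become case-insensitive and tolerant of stray whitespace.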
Indexes¶
A number of different kinds of indexes are offered.
General¶
Support for working with zope.catalog.field
indexes.
All of the indexes we define are compatible with both
zope.catalog
query syntax (and internal attributes) and zc.catalog
syntax (and public attributes).
-
class
nti.zope_catalog.index.
NormalizingFieldIndex
(family=None)[source]¶ Bases:
nti.zope_catalog.index._ZipMixin
,zope.index.field.index.FieldIndex
,zope.container.contained.Contained
A field index that normalizes before indexing or searching.
Note
For more flexibility, use a
NormalizationWrapper
.
-
class
nti.zope_catalog.index.
CaseInsensitiveAttributeFieldIndex
(field_name=None, interface=None, field_callable=False, *args, **kwargs)[source]¶ Bases:
zope.catalog.attribute.AttributeIndex
,nti.zope_catalog.index.NormalizingFieldIndex
An attribute index that normalizes case. It is queried with a two-tuple giving the min and max values.
-
class
nti.zope_catalog.index.
ValueIndex
(family=None)[source]¶ Bases:
nti.zope_catalog.index._ZCApplyMixin
,nti.zope_catalog.index._ZCAbstractIndexMixin
,nti.zope_catalog.index._ZipMixin
,zc.catalog.index.ValueIndex
An index of raw values.
-
class
nti.zope_catalog.index.
AttributeValueIndex
(field_name=None, interface=None, field_callable=False, *args, **kwargs)[source]¶ Bases:
nti.zope_catalog.index.ValueIndex
,zc.catalog.catalogindex.ValueIndex
An index of values stored in a particular attribute.
-
class
nti.zope_catalog.index.
SetIndex
(family=None)[source]¶ Bases:
nti.zope_catalog.index._ZCAbstractIndexMixin
,nti.zope_catalog.index._SetZipMixin
,zc.catalog.index.SetIndex
An index of values that are multiple.
-
class
nti.zope_catalog.index.
AttributeSetIndex
(field_name=None, interface=None, field_callable=False, *args, **kwargs)[source]¶ Bases:
nti.zope_catalog.index.SetIndex
,zc.catalog.catalogindex.SetIndex
An index of values that are multiple and stored in an attribute.
-
class
nti.zope_catalog.index.
IntegerValueIndex
(family=None)[source]¶ Bases:
nti.zope_catalog.index._ZCApplyMixin
,nti.zope_catalog.index._ZCAbstractIndexMixin
,nti.zope_catalog.index._ZipMixin
,zc.catalog.index.ValueIndex
A “raw” index that is optimized for, and only supports, storing integer values. To normalize, use a
zc.catalog.index.NormalizationWrapper; to store in a catalog and normalize, use a NormalizationWrapper (which is an attribute index).
-
class
nti.zope_catalog.index.
IntegerAttributeIndex
(field_name=None, interface=None, field_callable=False, *args, **kwargs)[source]¶ Bases:
nti.zope_catalog.index.IntegerValueIndex
,zc.catalog.catalogindex.ValueIndex
An attribute index that is optimized for, and only supports, storing integer values. To normalize, use a
zc.catalog.index.NormalizationWrapper; note that because zc.catalog.catalogindex.NormalizationWrapper is also an attribute index it cannot be used to wrap this class, and your normalizer will have to return an object that has the right attribute.
-
class
nti.zope_catalog.index.
NormalizingKeywordIndex
(family=None)[source]¶ Bases:
nti.zope_catalog.index._SetZipMixin
,zope.index.keyword.index.CaseInsensitiveKeywordIndex
,zope.container.contained.Contained
A case-insensitive keyword index supporting traditional queries as well as extent-based queries.
-
class
nti.zope_catalog.index.
AttributeKeywordIndex
(field_name=None, interface=None, field_callable=False, *args, **kwargs)[source]¶ Bases:
zope.catalog.attribute.AttributeIndex
,nti.zope_catalog.index.NormalizingKeywordIndex
An index for keywords stored in an attribute.
-
class
nti.zope_catalog.index.
NormalizationWrapper
(field_name=None, interface=None, field_callable=False, index=None, normalizer=None, is_collection=False)[source]¶ Bases:
nti.zope_catalog.index._ZCApplyMixin
,zc.catalog.catalogindex.NormalizationWrapper
An attribute index that wraps a raw index and normalizes values.
This class exists mainly to sort out the difficulty constructing instances by only accepting keyword arguments.
You should only call this constructor with keyword arguments; due to inheritance, mixing and matching keyword and non-keyword is a bad idea. The first three arguments that are not keyword are taken as field_name, interface and field_callable.
-
nti.zope_catalog.index.
stemmer_lexicon
(lang='english', stopwords=True)[source]¶ A lexicon for text indexes using zc.catalog.
-
class
nti.zope_catalog.index.
AttributeTextIndex
(field_name=None, interface=None, field_callable=False, *args, **kwargs)[source]¶ Bases:
zope.catalog.text.TextIndex
A 64-bit text index.
Example:
index = AttributeTextIndex('field', lexicon=stemmer_lexicon())
-
family
= <BTree family using 64 bits. Supports signed integer values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 and maximum unsigned integer value 18,446,744,073,709,551,615.>¶ We default to 64-bit btrees.
-
Topics¶
Support for writing topic indexes and the filtered sets that go with them.
-
class
nti.zope_catalog.topic.
TopicIndex
(family=None)[source]¶ Bases:
zope.index.topic.index.TopicIndex
,zope.container.contained.Contained
A topic index that implements IContained and ICatalogIndex for use with catalog indexes.
To summarize, a topic index is a way to divide objects into a set of groups (aka topics). The groups are determined by the contents of this object, which are called filters. Each filter is conceptually like a mini-index itself, but in practice most of them are simply used to store group membership when some criteria are met; for that purpose the ExtentFilteredSet is ideal.
-
family
= <BTree family using 64 bits. Supports signed integer values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 and maximum unsigned integer value 18,446,744,073,709,551,615.>¶ We default to 64-bit btrees.
-
apply
(query)[source]¶ Queries this index and returns the set of matching docids.
The query can be in one of several formats:
- A single string or a list of strings. In that case, docids that are in all the given topics (by id) are returned. This is equivalent to the zc.catalog-style all_of operator.
- A mapping containing exactly two keys, operator and query. The value for operator is either 'and' or 'or' to specify intersection or union, respectively. The value for query is again a string or list of strings.
- A dictionary containing exactly one key, either any_of or all_of, whose value is the string or list of string topic IDs.
-
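The three query formats accepted by apply() can be modelled with ordinary sets. The sketch below is not the index's implementation; `apply_topic_query` is a hypothetical function, and `topics` is a plain dictionary standing in for the filters, mapping each topic ID to the docids it contains.

```python
def apply_topic_query(topics, query):
    """Resolve a topic query against a mapping of topic ID -> set of docids."""
    if isinstance(query, str):
        ids, operator = [query], 'and'
    elif isinstance(query, (list, tuple)):
        ids, operator = list(query), 'and'
    elif 'operator' in query:
        ids, operator = query['query'], query['operator']
    elif 'any_of' in query:
        ids, operator = query['any_of'], 'or'
    else:
        ids, operator = query['all_of'], 'and'
    if isinstance(ids, str):
        ids = [ids]
    docid_sets = [topics.get(topic_id, set()) for topic_id in ids]
    if not docid_sets:
        return set()
    combine = set.intersection if operator == 'and' else set.union
    return combine(*docid_sets)


topics = {'a': {1, 2, 3}, 'b': {2, 3, 4}}
in_both = apply_topic_query(topics, ['a', 'b'])
in_either = apply_topic_query(topics, {'any_of': ['a', 'b']})
```

A bare string or list behaves like all_of (intersection), while the operator and any_of forms let the caller ask for a union instead.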
-
class
nti.zope_catalog.topic.
ExtentFilteredSet
(fid, expr, family=None)[source]¶ Bases:
zope.index.topic.filter.FilteredSetBase
A filtered set that uses a zc.catalog.interfaces.IExtent to store document IDs; this can make for faster, easier querying of other indexes.
Create a new filtered extent.
Parameters: expr – A callable object of three parameters: this object, the docid, and the document. This will be available as the value of getExpression(). If you pass None, you can override getExpression yourself.
Caution
This is often a persistent object, so if you pass a filter, it must be picklable. In general and for the most flexibility, instead of passing something like IFoo.providedBy, pass a global (function) object in your own module.
-
family
= <BTree family using 64 bits. Supports signed integer values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 and maximum unsigned integer value 18,446,744,073,709,551,615.>¶ We default to 64-bit btrees
-
Changes¶
3.0.1 (2021-05-13)¶
- Fix the ExtentFilteredSet to only unindex documents that were previously indexed. This avoids an extra readCurrent call. See issue 12.
3.0.0 (2021-05-12)¶
- Add support for Python 3.7, 3.8 and 3.9.
  Note that zopyx.txng3.ext version 4.0.0, the current version at this writing, may or may not build on CPython 3, depending on how your compiler and compiler options treat undefined functions. See this issue.
  Also note that both PyPy 3.6 and 3.7 (7.3.4) are known to crash when running the test suite. PyPy2 7.3.4 runs the test suite fine.
- When updating indexes in a catalog, first check if the type of each object to be visited implements INoAutoIndex. If it does, we can avoid prematurely activating persistent ghost objects. See issue 8.
- Require ZODB 5 in order to use the new prefetch() method.
- When adding or updating an index in a catalog, use ZODB’s prefetch method to grab chunks of object state data from the database. This can be substantially faster than making requests one at a time. This introduces a new class CatalogPrefetchIterator that may be useful in other circumstances. See issue 7.
2.0.0 (2017-11-05)¶
- Rename TimestampToNormalized64BitIntNormalizer to TimestampTo64BitIntNormalizer for consistency.
- Make TimestampTo64BitIntNormalizer subclass TimestampNormalizer for simplicity.
- Rename FloatToNormalized64BitIntNormalizer to PersistentFloatTo64BitIntNormalizer for consistency and to reflect its purpose.
- Make PersistentFloatTo64BitIntNormalizer subclass FloatTo64BitIntNormalizer.
- Add IDeferredCatalog and an implementation in DeferredCatalog to allow creating catalog objects that don’t participate in event subscription-based indexing. This replaces IMetadataIndex, which is now an alias for this object. See issue 3.
1.0.0 (2017-06-15)¶
- First PyPI release.
- Add support for Python 3.
- TimestampNormalizer also normalizes incoming datetime objects.
- Fix extent-based queries for NormalizedKeywordIndex.
- 100% test coverage.