Assuming I use the embedded Elasticsearch:

a. What is the added load on my server?

Initial indexing can require a large amount of RAM. We recommend allocating at least 6 GB to the Bitbucket instance. As an example benchmark, our plugin indexed the Linux code base (approximately 15 million lines of code and half a million commits) in 2-3 minutes with 6 GB of RAM on non-SSD hard drives with an internal node. Most of the strain on the system is generated by the initial indexing.

b. Are there special HW requirements to support it?

No, but SSDs improve indexing speed tremendously.

If I use the global search, will it apply permission restrictions (meaning that if I have no access to a project or repository, it will be omitted from the results)?

Yes. Repository permissions are applied to search results.

How do I control the Elasticsearch process?

The internal node is embedded in the Bitbucket JVM, so it runs inside the Bitbucket process. You can manually trigger indexing operations from the global settings or from the settings of an individual repository. Indexing activity is written to the Bitbucket logs. If you want even more detailed logs, enable debug mode.

How can I re-index all?

Go to the admin panel in Bitbucket. There should now be a Search Global Settings option under Add-ons. Make sure indexing is enabled, then click Reindex all to trigger a reindex.

How can I know indexing status?

There is a progress bar at the top of the settings pages. It is available for Project, Repository and Global Settings.

After a code change, how soon should I expect the search to find it?

All pushed changes generate events that our event handlers capture. It should be a matter of seconds until a change is indexed.

How much disk space is used by this add-on?

In the worst case, the Elasticsearch node will use about the same amount of space as Bitbucket uses to store repositories, so plan for double your current disk space to be safe.
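As a rough way to check this rule of thumb before enabling indexing, the sketch below measures how much space a repository directory uses and compares it with the free space on the volume that would hold the index. The function names and the directories you pass in are assumptions for illustration; the location of the repositories and of the index depends on your installation.

```python
import os
import shutil


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file disappeared or is unreadable; ignore it
    return total


def has_room_for_index(repo_bytes, free_bytes):
    """Worst case from the answer above: the index is as large as the repositories."""
    return free_bytes >= repo_bytes


def check(repo_dir, index_dir):
    """repo_dir and index_dir are hypothetical paths supplied by the caller."""
    repo_bytes = directory_size_bytes(repo_dir)
    free_bytes = shutil.disk_usage(index_dir).free
    return has_room_for_index(repo_bytes, free_bytes)
```

Calling check() with your repository directory and the directory that will hold the index returns True when the free space covers the worst case.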

Is there an API I can use for queries?

Our plugin currently does not have an API, but the Elasticsearch node does. For security reasons, you cannot currently query the internal node. You must use an external node if you want to query the node. See the Elasticsearch documentation for more info on querying Elasticsearch.
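If you run an external node, a query against it could look roughly like the sketch below, which posts a match query to the standard Elasticsearch _search endpoint. The index name and field name used in the usage note are hypothetical, since the plugin's index layout is not described here; inspect your own node to find the real ones.

```python
import json
import urllib.request


def build_search_request(base_url, index, field, text, size=10):
    """Build a POST request for the Elasticsearch _search endpoint.

    index and field are placeholders: the names the plugin actually uses
    are not documented here and must be looked up on your node.
    """
    body = {"query": {"match": {field: text}}, "size": size}
    url = f"{base_url.rstrip('/')}/{index}/_search"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(base_url, index, field, text, size=10):
    """Send the query and return the _source of each hit."""
    request = build_search_request(base_url, index, field, text, size)
    with urllib.request.urlopen(request, timeout=10) as response:
        result = json.load(response)
    return [hit["_source"] for hit in result["hits"]["hits"]]
```

For example, search("http://localhost:9200", "my-index", "content", "TODO") would return the stored documents that match "TODO", assuming an index and field with those names exist on the external node.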

For every new commit, is the entire project indexed, or just the files changed in that commit?

For every new commit, the plugin will do a diff of the changes and then apply those changes to the index. It doesn't reindex the whole repository because it doesn't have to.
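The idea can be illustrated with a small sketch. It is not the plugin's code: it assumes each commit is represented as a mapping from file path to Git object ID, and it uses an in-memory dictionary as a stand-in for the index. The function names and the read_blob callback are hypothetical.

```python
def diff_trees(old, new):
    """Compare two {path: object_id} mappings and classify the paths."""
    added = {path for path in new if path not in old}
    deleted = {path for path in old if path not in new}
    modified = {path for path in new if path in old and old[path] != new[path]}
    return added, modified, deleted


def apply_diff(index, old, new, read_blob):
    """Update index in place, touching only paths that changed.

    index maps path -> indexed content; read_blob(object_id) returns the
    content of a blob (a hypothetical callback supplied by the caller).
    Returns the number of paths touched.
    """
    added, modified, deleted = diff_trees(old, new)
    for path in deleted:
        index.pop(path, None)
    for path in added | modified:
        index[path] = read_blob(new[path])
    return len(added) + len(modified) + len(deleted)
```

Unchanged files are never read or rewritten, which is why the cost of a new commit depends on the size of the change and not on the size of the repository.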

How does the plugin manage to index the different branches?

The data structure is similar to Git's. If a file differs between branches (based on its Git object ID), each branch has its own entry. If the file is the same, the entry is shared across branches.
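A minimal sketch of this kind of sharing follows. It is only an illustration of the principle, not the plugin's actual schema: the class name, method names, and the in-memory dictionaries are assumptions.

```python
class BranchIndex:
    """Toy index where content is stored once per Git object ID."""

    def __init__(self):
        self.documents = {}  # object_id -> file content
        self.branches = {}   # branch -> {path: object_id}

    def index_file(self, branch, path, object_id, content):
        # Identical content has an identical object ID, so it is stored once
        # and merely referenced from every branch that contains it.
        self.documents.setdefault(object_id, content)
        self.branches.setdefault(branch, {})[path] = object_id

    def search(self, branch, term):
        """Return the paths on branch whose content contains term."""
        files = self.branches.get(branch, {})
        return sorted(
            path for path, object_id in files.items()
            if term in self.documents[object_id]
        )
```

Indexing the same file on two branches adds one document and two references. Once the file changes on one branch, its new object ID produces a second document, and searches on each branch see only that branch's version.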

Which version of Elasticsearch do I need for an external node?

Smarter Search Version

Elasticsearch version compatibility



<= 3.1.4