I'd like to invite members of the open-source community, particularly (but not exclusively) those involved with PHP, to join in designing and developing a general-purpose ETL framework for data migration. The vendor name for packaging components of this project is soong, and git repos for existing components are under the GitLab account "soongetl".
Note: Having finally finished composing this lengthy monologue, it's clear to me that it's very ambitious (some might say arrogant) of me to write with the expectation that this will grow into a large and robust open-source ecosystem. Very well - it is ambitious, and the effort may very well fall flat on its face. C'est la vie...

Who am I?
I'm Mike Ryan - a lot of people in the Drupal community know me, but not so much the wider open-source community. Almost eleven years ago at a Drupal meetup in Boston, amongst general agreement that everyone hates to do data migration, Moshe Weitzman looked across the table at me and said "there's an opportunity here." Since then data migration into Drupal has been the primary focus of my professional life, first in partnership with Moshe, then as an Acquia employee and finally as a solo consultant. Over the years I've created several migration-related contrib modules for Drupal, was part of the team integrating migration into Drupal core for D8, and have been involved in dozens of real-world migration projects.

Why am I doing this? I think we can do better
Just within Drupal, the migration framework can be improved:
- Each generation of Drupal migration support has been a port of the previous one - from the hook-based Drupal 6 version, to the inheritance-and-composition model in Drupal 7, to the plugin-based system in Drupal 8, technical debt has accumulated. I've wanted for a while to start over with a clean slate - given my experience (and others'), what would we do differently starting from scratch? Can we step back and re-examine the assumptions we've been carrying forward?
- At a specific technical level, the biggest itch I've wanted to scratch is decoupling the components. Within the migration system as it is in D8 today, pretty much every component knows everything about every other component. At one point we had a destination plugin which was using some of the migration's source plugin configuration - that one made my eye twitch!
- There's also the coupling of the migration system with Drupal - in particular, migration classes *are* Drupal plugins (i.e., their interfaces extend PluginInspectionInterface) rather than being *managed by* Drupal plugins. I would like to see migration classes be all about migration, rather than worry about being plugins as well. And once the basic migration classes are no longer Drupal plugins, then it's a small step to them being entirely independent of Drupal...
With Drupal 8, we’ve often talked about “getting off the island” in terms of benefiting from much fine PHP work done outside of the Drupal community. We haven’t talked so much about going in the opposite direction - making our own fine work available for use beyond Drupal. To my knowledge, the only published example of this so far is Kris Vanderwater (EclipseGC) with the plugin library.
Likewise, we Drupal developers don't have a monopoly on good migration ideas - by moving the general-purpose aspects of migration into a separate open source project, we have the opportunity to benefit from new ideas and new talent.

The community we build
The major key to success for any large open-source project like this is a thriving community. Having seen open-source projects like Drupal grow organically - and face growing pains as they find themselves dealing with community problems reactively rather than proactively - I would like, if a community does form around this project, to establish a supportive and welcoming tone from the beginning.
Diversity in particular remains an issue in the tech industry in general, and open-source especially - and a lack of diversity is difficult to correct after the fact. In building a community around this framework, my hope is that we draw a diverse set of developers from the beginning, so that seeding the garden well will be, if not self-sustaining, at least more sustainable. How to do that, I'm not certain - a concerted outreach effort could easily end up looking like Pokemon Go, searching for unique creatures to collect. Apart from starting with a good Code of Conduct, I'm open to suggestions!
Another aspect of community-building is providing opportunities for relative novices (whether new to open-source development, new to PHP, or new to migration). The proposed architecture involves myriad small, well-focused packages - an extractor here, a set of related transformers there, integrations for specific frameworks and APIs... Individual transformers, in particular, will generally be very simple. This ecosystem thus will provide ample opportunities for novices to gain experience with mentorship and also establish an online presence.
Now, all that being said, what about The Ethics of Unpaid Labor and the OSS Community (also see the recent Twitter discussion in the Drupal community)? In reaching out to underrepresented groups and to novices, we are reaching out to the people who have the least ability to work on open source for free. One way to ameliorate this effect may be to explicitly try to draw in students - whether in formal programs or teaching themselves software development - who will benefit from some free practical education and mentorship. Down the road, if this framework does start being adopted in real-world applications, we can look at ways to get sponsorships for people who maintain projects within the ecosystem. At any rate, as the community here grows I expect this will be an ongoing conversation.

Selfishness
Yes, I'm willing to cop to selfish reasons to pursue this.
- Simple ego: I'm proud of the work I've done on migration in Drupal, and think it can be useful on a larger stage. Being old enough to see retirement on the horizon, I admit I'm thinking of this as my magnum opus - the last major contribution I make to open source. I would love to leave behind a significant piece of quality software with a vital community behind it.
- Money: I've done fine as a Drupal data migration specialist. I hope to do better by expanding my market beyond Drupal, working on a wider variety of migration projects. Yes, retirement is on the horizon but, given earlier attempts at consulting which went less well than my "migration period" has, my funds put that horizon farther out than I'd like...
Early last year I started playing around with a proof-of-concept in a single repo, getting a single basic ETL migration scenario running with a decoupled class structure based on the basic architecture of the Drupal migration system. Much of the work after getting the initial POC running was figuring out appropriate boundaries between components, and gradually introducing features beyond the most basic ones I started with. And then breaking pieces out into separate source repos, and figuring out those boundaries.

My role
This will certainly change according to the number and skills of contributors who join into this effort (assuming there are some!), but what I'm aiming for in terms of my own role:
- Primary architect of version 1 of Soong. This would mean being the primary maintainer of architecture documentation and the repository of central interfaces/base classes. Per "selfishness" above - I have an architectural vision I want to see brought to fruition. Others may take it in different directions after that, but V1 is mine! tl;dr - I don't want to be BDFL; I do want to be BDF1.
- Community leader. Per "community" above, I have a vision for building a diverse and vibrant open-source community from the ground up. Unlike the technical architecture, however, this plays less to my strengths, so I will be happy to defer as better-suited people show leadership in the community.
- Mentorship. I'd like to help people up their development skills, their open-source involvement, and their understanding of the pits and perils of data migration.
After having it in the back of my head for a few years, I finally started creating repos and putting my thoughts into actual interfaces and classes several months ago. Why did I wait until now to share my work with the larger community? I certainly felt seen when I read this:
Frankly, there's an element of imposter syndrome here - I wanted to be sure I wasn't exposing any dumb ideas! Well, enough of that - instead, I now stipulate that you will find dumb things I did here, and ask that you help smartify them.

The architecture itself
- A Task accepts configuration defining a migration process, and implements operations - most notably migrate, but it may also support other operations like rollback, status, analyze, … The following steps describe the migrate operation.
- The task constructs the configured Extractor, which obtains data from a source such as a SQL query, a CSV file, an XML/JSON API, etc.
- Iterating over the extractor returns one DataRecord (collection of named DataProperty instances) at a time containing source data. The task creates an empty DataRecord representing the destination data.
- The task configuration defines a transform pipeline keyed by destination property names. For each of these properties, a sequence of one or more Transformer classes with corresponding configuration is invoked to determine the destination property value - usually, the first one will be configured to accept one or more source property names, and the results will be fed to subsequent transformers, with the final result assigned to the named property in the destination DataRecord.
- The destination DataRecord is passed to the configured Loader to be loaded into the destination store - a SQL database, a CSV file, etc.
- If an optional KeyMap is configured within the task, it is used to store the mapping from the source record's unique key to the destination record's unique key. This enables keyed relationships to be maintained even if keys change when migrating, as well as enabling rollback. (A rough code sketch of this flow follows below.)
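To make the steps above more concrete, here is a minimal, hypothetical sketch of a migrate operation built from these pieces. The interface names, method signatures and configuration keys are illustrative assumptions rather than the actual Soong API, and DataRecord objects are simplified to plain associative arrays.

```php
<?php
declare(strict_types=1);

// Illustrative interfaces only - not the real Soong API.
interface TransformerInterface
{
    /** Produce (or refine) a destination value according to configuration. */
    public function transform($value, array $configuration);
}

interface LoaderInterface
{
    /** Persist one destination record (SQL row, CSV line, entity, ...). */
    public function load(array $destinationRecord): void;
}

/**
 * A Task's "migrate" operation: iterate the extractor, run each configured
 * transform pipeline, and hand the result to the loader.
 *
 * @param iterable $extractor  Yields associative arrays of source properties.
 * @param array    $pipelines  Destination property => list of [transformer, configuration] pairs.
 */
function migrate(iterable $extractor, array $pipelines, LoaderInterface $loader): void
{
    foreach ($extractor as $sourceRecord) {
        $destinationRecord = [];
        foreach ($pipelines as $destinationProperty => $steps) {
            // The first step usually pulls one or more named source properties;
            // later steps refine the intermediate value.
            $value = null;
            foreach ($steps as [$transformer, $configuration]) {
                if ($value === null && isset($configuration['source'])) {
                    $value = $sourceRecord[$configuration['source']] ?? null;
                }
                $value = $transformer->transform($value, $configuration);
            }
            $destinationRecord[$destinationProperty] = $value;
        }
        $loader->load($destinationRecord);
        // An optional KeyMap would record source key => destination key here,
        // enabling rollback and preserving keyed relationships.
    }
}
```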
To try out a couple of working demos, git clone email@example.com:soongetl/poc.git and follow the README.

Initial technical priorities
- One of those infamous hard problems in computer science is naming things. Before we go too far, let's figure out how best to name things - I think Extractor/Transformer/Loader are pretty solid, but let's discuss whether other components (like Task) could use better names. Also, let's decide what naming conventions for implementations should look like - e.g., should CSV extractor and loader classes both be named CSV (or for that matter, Csv) with namespaces alone distinguishing them, or should they be CSVExtractor and CSVLoader?
- The initial architecture, as I've said before, comes from my narrow experience in Drupal. I'm sure there are plenty of other good migration ideas out there - maybe there's even a package I've missed that's good enough that this effort would better be directed towards improving it rather than starting from scratch. I did do some research last year and did not find any PHP ETL packages that appeared to have wide adoption or as much flexibility, but with more eyes on it (eyes that have seen more beyond Drupal than I have) let's see if we can do a thorough review of prior art and see if there are some good ideas which may influence this effort. And let's look beyond PHP as well - are there ETL frameworks written in other object-oriented languages which may provide some architectural inspiration?
- Review the boilerplate for Soong code repos (based on https://github.com/thephpleague/skeleton) - let's go over what we've got there (especially the code of conduct and contributing guidelines).
- Test all the things! Before adding new stuff, we need to add tests for the existing components, and set up automated testing on GitLab.
- For V1, require PHP 7.1 and leverage strict type checking. I expect future versions to require PHP 7.4 and leverage typed object properties. (A tiny illustration of both features follows this list.)
- The central interface package soong/soong ideally should not depend on anything other than PSR interfaces. It should be approached as if it were a PSR itself - a completely general interface for ETL functionality not dependent on any non-standard interfaces.
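For anyone who hasn't used those two language features, here is a trivial, hypothetical illustration (the class is made up purely for demonstration):

```php
<?php
declare(strict_types=1); // PHP 7.1+: scalar type hints are enforced, not coerced.

final class RecordCounter
{
    // PHP 7.4+ typed object property; before 7.4 this would have to be an
    // untyped property protected only by the methods' type hints.
    public int $count = 0;

    public function add(int $more): void
    {
        $this->count += $more;
    }
}

$counter = new RecordCounter();
$counter->add(3);
// $counter->add('3'); // With strict_types=1 this throws a TypeError instead of coercing.
echo $counter->count;  // 3
```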
Again, I know I am getting way ahead of myself here by imagining an active open-source community will quickly spring up here. I have talked to Drupal people about my ideas on occasion, and I expect there will be some interest there, but I very much hope other open-source developers can join this effort and provide different perspectives. I do believe strongly that a standard ETL library with a core of simple standard interfaces (making a simple move-my-stuff-from-here-to-there application a breeze) plus the flexibility to build complex systems to handle many types of data will be extremely valuable across many domains.
If I may try your patience a bit longer - I've spent a substantial portion of my time since my last contract pulling these thoughts together, and I am now in need of paid work (contact me if you need some data migration done!). I may fantasize about being sponsored to work fulltime on Soong, or be hopeful there's someone with a project that they think will benefit from Soong and thus I can make progress here in the course of solving their migration problem. Realistically, my next contract (or employment) most likely will not involve Soong development, so once I'm working I won't have as much time to manage this project - let's hope plenty of people join in to pick up my slack!
If you've made it this far, thank you for your time and I look forward to your merge requests!
So far in our series of posts about helping you become a better Drupal developer, we've talked a lot about contribution from an individual point of view. But Drupal is community first! Even if you believe you've already reached your potential as a developer, remember that people before you have said - and subscribed to - the motto: "Come for the code, stay for the community." Continue reading...
The post Grow as a Drupal developer: embrace the community! appeared first on Manifesto.
We've heard of test-driven development, behaviour-driven development, feature-driven development and someone has probably invented buzzword-driven development by now. Here's my own new buzzword phrase: review-driven development. At ComputerMinds, we aim to put our work through peer reviews to ensure quality and to share knowledge around the team. Chris has recently written about why and how we review our work. We took some time on our last team 'CMDay' to discuss how we could make doing peer reviews better. We found ourselves answering this question:

Why is reviewing hard? How can we make it easier?
We had recently run into a couple of specific scenarios that had triggered our discussion. For one, pressure to complete the work had meant reviews were rushed or incomplete. The other scenario had involved such a large set of code changes that reviewing them all at the end was almost impossible. I'm glad of the opportunity to reflect on our practice. Here are some of the points we have come away with - please add your own thoughts in the comments section below.

1. Coders, help your reviewers
The person that does the development work is the ideal person to make a review easy. The description field of a pull request can be used to write a summary of changes, and to show where the reviewer should start. They can provide links back to the ticket(s) in the project's issue tracking system (e.g. Redmine/Jira), and maybe copy across any relevant acceptance criteria. The coder can chase a colleague to get a review, and then chase them up to continue discussions, as it is inevitable that reviewers will have questions.

2. Small reviews are easier
Complicated changes may be just as daunting to review as to build. So break them up into smaller chunks that can be reviewed more easily. This has the massive benefit of forcing a developer to really understand what they're doing. A divide & conquer approach can make for a better implementation and is often easier to maintain too, so the benefits aren't only felt by reviewers.

3. Review early & often
Some changes can get pretty big over time. They may not be easy to break up into separate chunks, but work on them could well be broken up into iterations, building on top of each other. Early iterations may be full of holes or @TODO comments, but they still reveal much about the developer's intentions & understanding. So the review process can start as early as the planning stage, even when there's no code to actually review. Then as the changes to code take shape, the developer can continually return to the same person every so often. That reviewer's contextual knowledge grows along with the changes, helping them understand what's going on and provide a better review.

4. Anyone can review
Inevitably some colleagues are more experienced than others - but we believe reviews are best shared around. Whether talking about your own code, or understanding someone else's code, experience is spread across the team. Fresh eyes are sometimes all that's needed to spot issues. Other times, it's merely the act of putting your own work up for scrutiny that forces you to get things right.

5. Reviewers, be proactive!
Developers like to be working, not waiting for feedback. Once they've got someone to agree to review their work, they have probably moved onto solving their next problem. However well they may have written up their work, it's best for the reviewer to chase the developer and talk through the work, ideally face-to-face. Even if the reviewer then goes away to test the changes, or there's another delay, it's best for the reviewer to be as proactive as possible. Clarify as much as needed. Chase down the answers. Ask seemingly dumb questions. Especially if you trust the developer - that probably means you can learn something from them too!

6. Use the tools well
Some code changes can be ignored or skipped through easily. Things like the boilerplate code around features exports in Drupal 7, or changes to composer.lock files. Pointers from the developer to the reviewer about which files/changes are important are really helpful. Reviewers themselves can also get wise as to what things to focus on. Tools can help with this - hiding whitespace changes in diffs, the files tab of PRs on GitHub, or three-way merge tools, for example. Screenshots or videos are essential for communicating between developer & reviewer about visual elements when they can't meet face-to-face.

7. What can we learn from drupal.org?
The patch-based workflow that we are forced to use on drupal.org doesn't get a lot of good press. (I'm super excited for the GitLab integration that will change this!) But it has stood the test of time. There are lessons we can draw from our time spent in its issue queues and contributing patches to core and contrib projects. For example, patches often go through two types of review, which I'd call 'focussed nitpicks' and the wider 'approach critiques'. It can be too tempting to write code to only fulfil precise acceptance criteria, or to pass tests - but reviewers are humans, each with their own perspectives to anticipate. Aiming for helpful reviews can be even more useful for all involved in the long-run than merely aiming to resolve a ticket.

8. Enforcing reviews
We tailor our workflow for each client and project - different amounts of testing, project management and process are appropriate for each one. So 'review-driven development' isn't a strict policy to be enforced, but a way of thinking about our work. When it is helpful, we use GitHub's functionality to protect branches and require reviews or merges via pull requests. This helps us to transparently deliver quality code. We also find this workflow particularly handy because we can broadcast notifications in Slack of new pull requests or merges that will trigger automatic deployments.

What holds you back from doing reviews? What makes a review easier?
I've only touched on some of the things we've discussed and there's bound to be even more that we haven't thought of. Let us know what you do to improve peer reviewing in the comments!
It sometimes happens that users and all their content get deleted without anyone realising how bad the consequences can be. This module tries to prevent that from happening, and also adds a user-deletion method that lets you assign all of the content created by the user you are deleting to another user.
In a nutshell, this module adds:
- an extra option to assign content to a new user, and makes it the default
- warning styles and extra text for the core option that deletes content along with the user
Meet Gabriele Maira, also known as Gabi by friends and as Gambry by the Drupal community. With over 15 years of experience working with PHP and over 10 working with Drupal, Gabriele is currently the PHP/Drupal Practice lead at the London-based Manifesto Digital. Read about his beginnings with open source and why he thinks every Drupal developer should attend a Sprint at least once in their life.
Instances of advanced search can be seen in many different fields. Take space research, for instance. In their perpetual effort to explore the Universe and search for Earth-like planets or a star similar to our Sun, scientists wind up discovering interesting things. NASA's Hubble Space Telescope was used to spot a star called Icarus, named after the Greek mythological figure, which is the most distant star ever viewed and is located halfway across the universe.
On the other side of the spectrum, Apache Solr is making huge strides with its advanced search capabilities. Enterprise-level search is a quintessential necessity for your online presence to be able to thrive in this digital age. Big organisations like Apple, Bloomberg, Marketo and Roku, among others, are opting for Apache Solr for its advanced search features. An amalgamation of Drupal and Apache Solr can be a remarkable solution for a magnificent digital presence.

Planting the seed in 2004

John Thuma, in one of his blog posts, stated that tracing the history of Apache Solr would take us back to the year 2004, when it was an in-house project at CNET Networks used for offering search functionality for the company's website. CNET then donated it to the Apache Software Foundation in 2006.
Later, the Apache Lucene and Solr projects combined in 2010 with Solr becoming a sub-project of Lucene. It has witnessed an awful lot of alterations since then and is now a very significant component in the market.

Uncloaking Apache Solr
Solr is a standalone enterprise search server with a REST-like API. You put documents in it (called "indexing") via JSON, XML, CSV or binary over HTTP. You query it via HTTP GET and receive JSON, XML, CSV or binary results. - Lucene.apache.org
As an enterprise-capable, open source search platform, Apache Solr is based on the Apache Lucene search library and is one of the most widely deployed search platforms in the world.
It is written in Java and offers both a RESTful XML interface and a JSON API that enables the development of search applications. Its perpetual development by an enormous community of open source committers under the direction of the Apache Software Foundation has been a great boost.
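As a rough illustration of that HTTP workflow (not taken from the Solr documentation), the PHP sketch below indexes one document into a hypothetical local core and then queries it; the core name and the dynamic field suffixes (_s, _t) are assumptions that depend on your schema.

```php
<?php
// Hypothetical local Solr core; adjust the host, core and field names for a real install.
$solr = 'http://localhost:8983/solr/my_core';

// Index (add/update) one document by POSTing JSON to the update handler.
$documents = json_encode([
    ['id' => 'article-1', 'title_s' => 'Hello Solr', 'body_t' => 'Enterprise search example'],
]);
$context = stream_context_create([
    'http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: application/json\r\n",
        'content' => $documents,
    ],
]);
file_get_contents($solr . '/update?commit=true', false, $context);

// Query via HTTP GET; results come back as JSON.
$response = json_decode(
    file_get_contents($solr . '/select?' . http_build_query(['q' => 'body_t:enterprise', 'wt' => 'json'])),
    true
);
print_r($response['response']['docs'] ?? []);
```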
Apache Solr is often debated alongside Elasticsearch. There is even a dedicated website called solr-vs-elasticsearch that compares both of them on various parameters. It states that both solutions support integration with open source content management systems like Drupal. Which one to select depends upon your organisation's needs.
For instance, if your team comprises a plentitude of Java programmers, or you already are using ZooKeeper and Java in your stack, you can opt for Apache Solr. On the contrary, if your team constitutes PHP/Ruby/Python/full stack programmers or you already are using the Kibana/ELK stack (Elasticsearch, Logstash, Kibana) for handling logs, you can choose Elasticsearch.

Characteristics of Apache Solr
Following are the features of Apache Solr:

Advanced search capabilities
- Spectacular matching capabilities: Apache Solr is characterised by advanced full-text search capabilities. It enables spectacular matching comprising phrases, grouping, wildcards, joins and so on, across any data type.
- A wide array of faceting algorithms: It has support for faceted search and filtering that enables you to slice and dice your data as needed (see the query sketch after this list).
- Location-based search: It offers out-of-the-box geospatial search functionalities.
- Multi-tenant architecture: It offers multiple search indices that streamlines the process of segregating content and users.
- Suggestions while querying: There is support for auto-complete while searching (typeahead search), spell checking and many more.
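To give a feel for the faceted search mentioned above, here is a small, hedged PHP sketch of a facet query against a hypothetical core; the field names are invented and the exact parameters depend on your schema.

```php
<?php
// Ask Solr for matching documents plus per-category counts (facets).
$params = http_build_query([
    'q'           => 'body_t:drupal',
    'facet'       => 'true',
    'facet.field' => 'category_s',
    'wt'          => 'json',
]);
$response = json_decode(
    file_get_contents('http://localhost:8983/solr/my_core/select?' . $params),
    true
);

// Field facet counts come back as a flat [value, count, value, count, ...] list.
print_r($response['facet_counts']['facet_fields']['category_s'] ?? []);
```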
Scalability

It is optimised for colossal spikes in traffic. Solr builds on Apache ZooKeeper, which makes it easy to scale up or down, and it has in-built support for replication, distribution, rebalancing and fault tolerance.

Support for standards-based open interfaces and data formats
It uses standards-based open interfaces like XML, JSON and HTTP. Furthermore, you do not have to waste time converting all the data to a common representation, as Solr supports JSON, CSV, XML and many more formats out of the box.

Responsive admin UI
It ships with an out-of-the-box admin user interface that makes it easier to administer your Solr instances.

Streamlined monitoring
Solr publishes a truckload of metric data via JMX that helps you get more insight into your instances. Moreover, logging is easy to monitor, as the log files can be accessed from the admin interface.

Magnificent extensions
It has an extensible plugin architecture that makes it simple to plug in both index-time and query-time plugins. It also provides optional plugins for indexing rich content, detecting language, clustering search results, amongst others.

Configuration management
Its flexibility and adaptability for easy configuration are top-notch. It also offers advanced configurable text analysis: there is support for most of the widely spoken languages in the world, plus a plethora of analysis tools that make indexing and querying your content flexible.

Performance optimisation
It has been tuned to handle the largest of sites, and its out-of-the-box caches have fine-grained controls that assist in optimising performance.

Amazing indexing capabilities
Solr leverages Lucene's Near Real-Time Indexing capabilities, which ensure that users see content almost as soon as it has been indexed. Also, its built-in Apache Tika integration simplifies the process of indexing rich content like Microsoft Word, Adobe PDF and many more.

Schema management
You can leverage Solr's data-driven schemaless mode in the early stages of development and lock it down for production.

Security
Solr has robust built-in security features such as SSL (Secure Sockets Layer), authentication and role-based authorisation.

Storage
Lucene's advanced storage options, like codecs and directories among others, ensure that you can fine-tune data storage to suit your application.

Leverage Apache UIMA
Content can be enhanced with its advanced annotation engines: Solr incorporates Apache UIMA to leverage NLP (Natural Language Processing) and other tools for your application.

Integrating Apache Solr with Drupal
Drupal's impressive flexibility empowers digital innovation and gives users the power to build almost anything, and it provides for integration of your website with the Solr platform. Drupal's Search API Solr Search module provides a Solr backend for the Drupal Search API module.
To begin with, you need to have Apache Solr installed on your server. This is followed by validating the Solr server's status from the terminal, and then installing the Search API Solr Search module using Composer.
Once the installation of the Search API Solr Search module is done, the configuration of Solr ensues. This involves the creation of a collection, which is basically a logical index linked to a config set. (Screenshot source: OSTraining)
Then Drupal's default search module is uninstalled to avoid any performance issues, and the Search API Solr Search module is enabled. You can then move on to configuring the Search API and, finally, test the Search API Solr Search module. (Screenshot source: OSTraining)
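Once the index is configured, content can also be queried programmatically through the Search API, regardless of which backend serves it. The sketch below is illustrative rather than taken from the module's documentation, and it assumes an index with the machine name 'solr_content':

```php
<?php
use Drupal\search_api\Entity\Index;

// Load a hypothetical Search API index that is backed by the Solr server.
$index = Index::load('solr_content');

// Build a fulltext query; the Solr backend translates and executes it.
$query = $index->query();
$query->keys('drupal');
$query->range(0, 10);
$results = $query->execute();

foreach ($results->getResultItems() as $item) {
    // Each result item wraps the original indexed Drupal entity.
    $entity = $item->getOriginalObject()->getValue();
    // ... render or process $entity as needed.
}
```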
Case study

The Rainforest Alliance (RA), an international non-profit organisation working towards strong forests, healthy agricultural landscapes and thriving communities through creative collaboration, leveraged the power of Drupal to revamp its website with the help of a digital agency.
RA has built a repository of structured content to support its mission; the content is primarily exhibited as long-form text, with a huge variety of metadata and assets associated with each piece. It wanted to revamp the site and enable the discovery of new content through the automatic selection of related content. It also required advanced permission features and publishing workflows.
Drupal turned out to be an astounding choice for fulfilling RA's requirement of portable and searchable content. It was also great because of its deep integrations with Apache Solr, which enabled a nuanced content relation engine. Solr was leveraged for powering various search interfaces. Furthermore, Drupal's wonderful content workflow features made it a perfect choice.
Solr offered 'more like this' (MLT) functionality that was more robust than just tagging content and showing other content with the same taxonomy terms. The Search API Solr Search module, which provides a Solr backend for the Search API module, was utilised for providing the interface to govern the servers and indexes. Then, within a custom block, MLT was leveraged to help generate related content lists.
The Page Manager module, in combination with the Layout Plugin and Panels modules, was used to build specialised landing pages, many of them with their own layouts. Various modules from Drupal's media ecosystem were very beneficial for administering images, embedding videos, and so on. Entity Embed, Entity Browser and Inline Entity Form made for a great editorial experience for content teams.

Conclusion
Apache Solr is a great solution for enabling enterprise-level search and can make a world of difference in combination with Drupal for your digital presence.
We have been empowering our partners in their digital transformation efforts with our expertise in Drupal development.
Ping us at firstname.lastname@example.org to extract advanced Solr features with Drupal.
Elasticsearch has been meritorious for The Guardian, one of the most reputed news organisations, by giving them the freedom to build a stupendous analytics system in-house rather than depending on a generic, off-the-shelf analytics solution. Their traditional analytics package was horrendous: extremely sluggish and consuming an enormous amount of time. The Elasticsearch-powered solution has turned out to be an enterprise-wide analytics tool and helped them understand how their content is being consumed.
Why is a large organisation like The Guardian choosing Elasticsearch for its business workflow? Elasticsearch is all about full-text search, structured search, analytics, the intricacies of handling human language, geolocation and relationships. Drupal, one of the leading content management frameworks, is a magnificent solution for empowering digital innovation and can help in implementing Elasticsearch. Before we look at Drupal's capability in implementing an Elasticsearch ecosystem, let's unwrap Elasticsearch first.
Elasticsearch is an open source, broadly distributable, RESTful search and analytics engine which is built on Apache Lucene. It can be accessed through an extensive and elaborate API. It enables incredibly fast searches for supporting your data discovery applications. It is used for log analytics, full-text search, security intelligence, business analytics and operational applications.

Elasticsearch is a distributed, scalable, real-time search and analytics engine - Elastic.io
It enables you to store, search and assess voluminous amounts of data swiftly and in near real time. In general, it is leveraged as the underlying engine/technology for powering applications that have sophisticated search features and requirements.
How does Elasticsearch work? With the help of the API or ingestion tools like Logstash, data is sent to Elasticsearch in the form of JSON documents. The original document is automatically stored by Elasticsearch and a searchable reference is added to the document in the cluster's index. The Elasticsearch API can then be utilised for searching and retrieving the document. Kibana, an open-source visualisation tool, can also be leveraged with Elasticsearch for visualising the data and creating interactive dashboards.
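For a flavour of that JSON document workflow from PHP, here is a short, hedged sketch using the official elasticsearch-php client; the index and field names are invented, and the exact parameters (for example, a required document type) vary between client major versions.

```php
<?php
require 'vendor/autoload.php'; // composer require elasticsearch/elasticsearch

use Elasticsearch\ClientBuilder;

$client = ClientBuilder::create()
    ->setHosts(['localhost:9200'])
    ->build();

// Send a JSON document; Elasticsearch stores it and makes it searchable.
$client->index([
    'index' => 'articles',
    'id'    => 'article-1',
    'body'  => ['title' => 'Hello Elasticsearch', 'body' => 'Near real-time search example'],
]);

// Full-text query against the same index.
$results = $client->search([
    'index' => 'articles',
    'body'  => ['query' => ['match' => ['body' => 'search']]],
]);

print_r($results['hits']['hits']);
```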
Elasticsearch is often debated alongside Apache Solr. There is even a dedicated website called solr-vs-elasticsearch that compares both of them on various metrics. Both solutions support integration with open source content management systems like Drupal. Which one to select depends upon your organisation's needs.
For instance, if your team comprises a superabundance of Java programmers, or you already are using ZooKeeper and Java in your stack, you can opt for Apache Solr. On the contrary, if your team includes PHP/Ruby/Python/full stack programmers or you already are using the Kibana/ELK stack (Elasticsearch, Logstash, Kibana) for handling logs, you can choose Elasticsearch.

Merits of Elasticsearch
Following are some of the benefits of Elasticsearch:
- Speed: Elasticsearch helps in leveraging and accessing all the data at a fast clip. Also, it makes it simple to rapidly build applications for multiple use cases.
- Performance: Being highly distributable, it allows the processing of a colossal amount of data in parallel and swiftly finds the best matches for your search queries.
- Scalability: Elasticsearch offers provision for easily operating at any scale without compromising on power and performance. It allows you to move from prototype to production seamlessly. It scales horizontally for governing multiple events per second while simultaneously handling the distribution of indices and queries across the cluster for efficacious operations.
- Integration: It comes integrated with visualisation tool Kibana. It also offers integration with Beats and Logstash for streamlining the process of transforming source data and loading it into Elasticsearch cluster.
- Safety: It detects failures for keeping the cluster and the data safe and available. With cross-cluster replication, a secondary cluster can be leveraged as a hot backup.
- Real-time operations: Elasticsearch operations like reading or writing data are usually performed in less than a second.
- Flexibility: It can pliably handle application search, security analytics, metrics, logging among others.
For designing a full Elasticsearch ecosystem in Drupal, Elasticsearch Connector, which is a set of modules, can be utilised. It leverages the official Elasticsearch PHP library and was built with the objective of handling large sets of data at scale. It is worth noting that this module is not covered by the security advisory policy.
The Elasticsearch Connector module can be utilised with a Drupal 8 installation and configured so that Elasticsearch receives the content changes. First, you need to download a stable release of Elasticsearch and start it. You can then move ahead and set up Search API. This is followed by connecting Drupal to Elasticsearch with the help of the Elasticsearch Connector module, which involves the creation of a cluster - the collection of node servers where all the data will get stored or indexed.
This is succeeded by the configuration of Search API. It offers an abstraction layer that allows Drupal to push content alterations to different servers such as Elasticsearch, Apache Solr, or any other provider that has a Search API-compatible module. The indexes are created on each of those servers with the help of Search API; these indexes are like buckets into which data can be pushed and then searched in different ways. Subsequently, content is indexed and data is processed.
We rebuilt the website of PMG using progressively decoupled Drupal, React and the Elasticsearch Connector module, among others.
To do the mapping and indexing on the Elastic server, the Elasticsearch Connector and Search API modules were leveraged. The development of the Elastic backend architecture was followed by building the faceted search application with React and incorporating the app in Drupal as a block or template page.
The project structure for the search was designed and developed in the sandbox with modern tools like Babel and Webpack and third-party libraries like Searchkit. Searchkit is a suite of React components that interact directly with your Elasticsearch cluster; every component can be customised as per your needs. Searchkit was of immense help in this project.
Logstash and Kibana, which are based on Elasticsearch, were integrated on the Elastic server. This helped in collecting, parsing, storing and visualising the data. The app in the sandbox was built for production and all the CSS/JS was integrated inside Drupal as a block, thereby making it a progressively decoupled feature.
The world is floating over a cornucopia of data. There is simply no end to the growth in the amount of data that is flowing through and produced by our systems. Existing technology has laid emphasis on how to store and structure warehouses replete with data.
But when it comes to making decisions in real time informed by that data, you need something like Elasticsearch for searching and analysing data in real time. Drupal can be a wonderful solution for implementing an Elasticsearch ecosystem with its suite of modules.
We have been steadfast in our goals of empowering digital innovation with our suite of services.
Contact us at email@example.com to reap the rewards of Elasticsearch and ingrain your digital presence with advanced search capabilities.
This module adds extra options to customize the Entity Embed Dialog:
1) You can choose a custom dialog title to display during the selection step.
2) You can choose a custom dialog title to display during the embed step.
3) You can choose a custom label for the "back" button that returns to the selection step.
4) You can select a view to display the entity embed, which allows for a nicer UI experience for the editor. You can customize the view (or create your own) to add images and an edit button (a default view is available out of the box with the module).
What happened since last month? In a nutshell:
- 2.0 & 2.1 released :)
- Usage continues to rise: ~400 → ~1700 sites, 50% of those on 2.0 [1]
- Gabe proposed JSON:API profiles for versioning/revisions and multiple-arity relationships, to take advantage of the upcoming 1.1 version of the spec
- New core patch to bring JSON:API to Drupal core: #2843147-101
- Several refactors of internals [2] that pave the path for hypermedia links and partial caching
- JSON:API Extras is kept in sync — about 300 of you use that with JSON:API 2.x.
Work-arounds for two very common use cases are no longer necessary: decoupled UIs that are capable of previews and image uploads [3].
- File uploads work similarly to Drupal core’s file uploads in the REST module, with the exception that a simpler developer experience is available when uploading files to an entity that already exists.
- Revision support is for now limited to retrieving the working copy of an entity using ?resourceVersion=rel:working-copy. This enables the use case we hear about the most: previewing draft Nodes. [4] Browsing all revisions is not yet possible due to missing infrastructure in Drupal core. With this, JSON:API leaps ahead of core's REST API. (See the request sketch below.)
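As a hedged illustration of that draft-preview request, fetching the working copy of a node over HTTP from PHP could look roughly like this (the base URL and UUID are placeholders, and the request must be authenticated as a user allowed to view the draft):

```php
<?php
// Placeholder site URL and node UUID; authentication headers are omitted here.
$url = 'https://example.com/jsonapi/node/article/NODE-UUID-HERE'
     . '?resourceVersion=rel:working-copy';

$context = stream_context_create([
    'http' => ['header' => "Accept: application/vnd.api+json\r\n"],
]);
$workingCopy = json_decode(file_get_contents($url, false, $context), true);

// In a JSON:API response, the draft field values live under data.attributes.
print_r($workingCopy['data']['attributes'] ?? []);
```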
Please share your experience with using the JSON:API module!
[1] Note that usage statistics on drupal.org are an underestimation! Any site can opt out from reporting back, and composer-based installs don't report back by default.
[4] Unfortunately only Node and Media entities are supported, since other entity types don't have standardized revision access control.
The File Replace module is a small utility providing site administrators with the possibility to replace files while keeping the file URI intact. This is useful in cases where a file is linked or used directly but needs to be updated occasionally.

Installation and usage
Install the module as you would any contrib module, preferably with composer:
composer require drupal/file_replace
This module is helpful for developers. Often you need to restrict code so that it only runs on the development site or the staging site, and that can be achieved with this module. Once configured, you can make your code run only on the environments you choose - Development, Staging or Production (Live).
This module allows you to send warnings and more serious errors from the error log via email to a specified address. Optionally, every PHP log entry can be sent, since a PHP error is definitely a problem on the site.