August 30, 2013

Data Migration through Migrate API

Tóth Ákos
Cloud Engineer

Migrate API is the most versatile module available for moving arbitrary data between Drupal sites of the same or even different core versions. It provides a unified way of handling any data type, and a simple-to-implement interface for making more elements available for migration. This tutorial will show you how to create your own migration handler.


When upgrading, importing or simply migrating a Drupal installation to another location, it is convenient if all the different data types are handled in the same place - otherwise, building a clean, easy-to-understand workflow becomes difficult. Migrate not only centralizes the process in a single interface, but is also capable of providing the workflow itself. But how does it work on the backend?


Migration Structure

The word migration implies that the module moves complete sites. This is not entirely true, as Migrate is prepared to handle alternative sources of data, such as CSV, JSON or XML files, or even PHP arrays. Collectively, these are called Migrate Sources. Unless you require something very exotic, the source handlers already provided with the Migrate module should suffice for any arbitrary migration, so I won't be covering them in this tutorial.

The API also needs to know where to put the data. This is a Migrate Destination. A destination handler is aware of the fields that can or need to be filled for a single piece of data, and it is capable of verifying the data's consistency and saving it in the form in which it is meant to be used. Each destination handler lives in its own class, implementing and overriding the necessary methods of the base class MigrateDestination. The Migrate module provides destination handlers for most core data, and the Commerce Migrate module extends this with the Commerce entities; however, it is not unusual for this to be insufficient.

Here's an example destination handler. Note that all example code below builds on the Migrate 2.6 API, which, at the time of writing this blog entry, is still in development.

    /**
     * Our example migration destination handler class.
     */
    class MigrateDestinationExampleData extends MigrateDestination {

      /**
       * Returns the key schema.
       *
       * This function is intended to return the Schema API definition for the
       * primary key of the example data type. Note that a primary key is not
       * necessarily a single field - for example, a Drupal 7 field instance is
       * identified by field_name, entity_type and bundle, and thus has three
       * fields in its primary key.
       *
       * @return array
       *   An associative array, where the keys are field names, and the values
       *   are valid Schema API definitions for the field.
       */
      public static function getKeySchema() {
        return array(
          'example_machine_name' => array(
            'type' => 'varchar',
            'length' => 32,
            'not null' => TRUE,
            'description' => 'The machine-readable name for this example.',
          ),
        );
      }

      /**
       * Returns a description of this destination.
       *
       * MigrateDestination declares __toString() as abstract, so every
       * destination handler has to implement it.
       *
       * @return string
       *   A localized description of the destination.
       */
      public function __toString() {
        return t('Example data');
      }

      /**
       * Provides a list of available fields on this destination.
       *
       * @return array
       *   An associative array, where the keys are field machine names, and the
       *   values are localized field labels.
       */
      public function fields() {
        return array(
          'example_machine_name' => t('Machine name'),
          'example_value' => t('Value'),
        );
      }

      /**
       * Prepares an item for migration.
       *
       * At this point, $imported already contains the data defined by the field
       * mappings. However, all the data is raw - some of it may need to be
       * reprocessed to a different form.
       *
       * It is advised to keep your imported data in an object form, as that makes
       * manipulating the data in the prepare function possible.
       *
       * @param mixed $imported
       *   The data being imported, pre-populated according to field mappings.
       * @param stdClass $source_row
       *   The row of data provided by the source handler.
       */
      public function prepare($imported, stdClass $source_row) {
        // Make sure there is an example value, in case it disappeared.
        if (empty($imported->example_value)) {
          $imported->example_value = 42;
        }
      }

      /**
       * Imports an item.
       *
       * This function handles the validation and saving of data. Any final stages
       * of processing are also handled here. You are responsible for making sure
       * your data is valid, and that it is saved.
       *
       * @param stdClass $imported
       *   The data being imported, pre-populated according to field mappings.
       * @param stdClass $row
       *   The row of data provided by the source handler.
       *
       * @return array|FALSE
       *   On success, an array whose values are the field values for each primary
       *   key field in the newly imported data.
       *   On failure, the function returns FALSE.
       */
      public function import(stdClass $imported, stdClass $row) {
        // The base class does not call prepare() and complete() by itself. As in
        // the destination handlers shipped with Migrate, import() invokes them.
        $this->prepare($imported, $row);

        // Save the example data into the database.
        $success = example_data_save($imported);

        if ($success) {
          $this->complete($imported, $row);

          // If the saving was successful, return the primary key, so that the API
          // knows where the old item was moved.
          return array($imported->example_machine_name);
        }
        else {
          // If the saving failed, return FALSE - Migrate will indicate this as
          // an error after the batch process is complete.
          return FALSE;
        }
      }

      /**
       * Take any additional action after an item has finished importing.
       *
       * @param mixed $imported
       *   Object to build. This is the complete object after saving.
       * @param stdClass $source_row
       *   The row of data provided by the source handler.
       */
      public function complete($imported, stdClass $source_row) {
        // Retrieve the current migration handler object. This represents the
        // migration that is currently in progress.
        $migration = Migration::currentMigration();

        // Set a message on the migration to let the world know that the import
        // succeeded.
        $migration->saveMessage('I have imported an item.', Migration::MESSAGE_INFORMATIONAL);
      }

      /**
       * Roll back an imported item.
       *
       * This function is expected to return the system to the state it was in
       * before this specific item was imported, assuming the rollbacks happen in
       * the reverse order of the imports. In any case, the item identified by the
       * IDs should be deleted or hidden.
       *
       * @param array $ids
       *   Array of fields representing the key.
       *   This array doesn't work the way you might expect - the keys are
       *   actually in the form of 'destid[0-9]+', where the number represents
       *   that this is the nth primary key field defined in the key schema,
       *   indexed from 1.
       *
       * @return bool
       *   TRUE on success, FALSE on failure.
       */
      public function rollback(array $ids) {
        // Retrieve the new machine name of the item being rolled back.
        $machine_name = $ids['destid1'];

        // Delete this item.
        example_data_delete($machine_name);

        // Assume that the deletion succeeded and return TRUE.
        return TRUE;
      }

    }
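
The 'destidN' keys described in the rollback() comment follow from the order of the fields in the key schema. The following is a plain PHP illustration of that naming convention only. The helper example_destid_keys() is hypothetical and is not part of Migrate; the field values are made up for the example.

```php
<?php

/**
 * Hypothetical helper (not part of Migrate): labels key values the way
 * rollback() receives them, i.e. destid1, destid2, ... in key schema order.
 */
function example_destid_keys(array $key_schema, array $values) {
  $ids = array();
  $position = 1;
  foreach (array_keys($key_schema) as $field_name) {
    $ids['destid' . $position] = $values[$field_name];
    $position++;
  }
  return $ids;
}

// A three-field key, modelled on the Drupal 7 field instance example.
$key_schema = array(
  'field_name' => array('type' => 'varchar', 'length' => 32, 'not null' => TRUE),
  'entity_type' => array('type' => 'varchar', 'length' => 32, 'not null' => TRUE),
  'bundle' => array('type' => 'varchar', 'length' => 128, 'not null' => TRUE),
);

$ids = example_destid_keys($key_schema, array(
  'field_name' => 'field_image',
  'entity_type' => 'node',
  'bundle' => 'article',
));
// $ids['destid1'] now holds the field_name value and $ids['destid3'] the
// bundle value.
```

With a single-field key such as the one returned by getKeySchema() in the handler, only 'destid1' is present, which is why rollback() reads exactly that key.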

A migration also requires a handler which serves as a bridge between the source and the destination. This bridge is responsible for providing the available source fields, and the default mapping of fields from source to destination. Migrate provides an interface for the user to configure any further mapping, and the mapped data is moved automatically and internally. As such, the only tasks left for you, the developer, are to make sure the data is sane in the destination handler, and to make sure all fields that make sense are available for migration in the migrate handler. An example migrate handler is below.

    /**
     * The migration handler for the Example Data migration.
     */
    class ExampleDataMigrateExampleMigration extends Migration {

      public function __construct(array $arguments) {
        parent::__construct($arguments);

        // A localized description of the data type, which will be displayed on
        // the migration page.
        $this->description = t('Migrate Example Data between Drupal sites.');

        // An array of dependencies, for hierarchical migrations. Each array value
        // must be the machine name for a specific migration (which is not
        // necessarily the same as the class name).
        $this->dependencies = array();

        // Define the source key schema in a similar manner to the destination key
        // schema. This ensures Migrate API can make a distinction between two
        // data items.
        $source_key = array(
          'example_machine_name' => array(
            'type' => 'varchar',
            'length' => 32,
            'not null' => TRUE,
            'description' => 'The machine-readable name for this example.',
          ),
        );

        // The Map object contains all relevant identifiers - the machine name for
        // the migration in question, the primary key for the source data and the
        // primary key for the destination data, the latter two in Schema API
        // format.
        $this->map = new MigrateSQLMap(
          $this->machineName,
          $source_key,
          MigrateDestinationExampleData::getKeySchema()
        );

        // In this example, we migrate from a database defined in settings.php.
        // Its connection key is stored in a variable, which is set on an example
        // configuration page.
        $key = variable_get('example_migration_key', 'example');
        $connection = Database::getConnection('default', $key);

        // Define the query which will be used by Migrate to retrieve all rows
        // from the database.
        $query = $connection->select('example_data', 'ed')
          ->fields('ed');

        // Define the source and destination for this migration. Both should be an
        // object of a class which extends the respective base handler.
        $this->source = new MigrateSourceSQL($query, array(), NULL, array('map_joinable' => FALSE));
        $this->destination = new MigrateDestinationExampleData();

        // Create the default field mappings. The user can change this at will on
        // the migration page.
        $this->addFieldMapping('example_machine_name', 'example_machine_name');
        $this->addFieldMapping('example_value', 'example_value')
          ->defaultValue(42);
      }

    }
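
To make the effect of the two field mappings at the end of the constructor more concrete, here is a deliberately simplified, plain PHP model of what happens to a single source row. It is not Migrate's actual implementation: the function example_apply_mappings() and the shape of the $mappings array are assumptions made for this sketch.

```php
<?php

/**
 * Simplified model of applying field mappings; NOT Migrate's real code.
 *
 * @param array $mappings
 *   Keyed by destination field. Each value holds a 'source' field name and
 *   an optional 'default' used when the source row has no value.
 * @param stdClass $source_row
 *   One row as a source handler might provide it.
 *
 * @return stdClass
 *   The object that would be handed to the destination handler.
 */
function example_apply_mappings(array $mappings, stdClass $source_row) {
  $destination = new stdClass();
  foreach ($mappings as $destination_field => $mapping) {
    $source_field = $mapping['source'];
    if (isset($source_row->$source_field)) {
      $destination->$destination_field = $source_row->$source_field;
    }
    elseif (array_key_exists('default', $mapping)) {
      $destination->$destination_field = $mapping['default'];
    }
  }
  return $destination;
}

// The same two mappings as in the constructor, in the sketch's own format.
$mappings = array(
  'example_machine_name' => array('source' => 'example_machine_name'),
  'example_value' => array('source' => 'example_value', 'default' => 42),
);

// This row has no example_value, so the default of 42 is used.
$row = new stdClass();
$row->example_machine_name = 'first_example';
$imported = example_apply_mappings($mappings, $row);
```

The resulting object corresponds to the $imported argument that the destination handler receives.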

And that's it! Your code should now be able to migrate any amount of Example Data from one Drupal site to another. However, it does not appear on the interface yet!

Registering and hierarchy

Before you are able to initiate a migration, you must register it with Drupal. Registering migrations allows all configuration to be completed before the user initiates the process. Prior to Migrate 2.6, migrations whose dependencies were met were registered automatically; due to popular demand, this is no longer the case. In Migrate 2.6, migrations are also organized into groups to ease categorization. Adding groups is strongly recommended. To create a group and register a new migration, add the following code snippet (to your configuration form's submit callback, for example):

    // When registering a migrate group, three parameters are expected:
    // a machine name for the group being registered, a human-readable name
    // which will be displayed on the migration overview page and an array of
    // arguments, which apply to all migrations in the migration group.
    MigrateGroup::register('Example', 'Very Important Migration Of Example Data', array());

    // Registering a migration expects a class name (which is our migration
    // handler), a machine name, and similarly, arguments.
    // It should be noted that unless group_name is added to the arguments,
    // the migration will be added to the 'default' group.
    Migration::registerMigration('ExampleDataMigrateExampleMigration', 'ExampleData', array('description' => 'Migrate all example data.', 'group_name' => 'Example'));
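
Migrate 2.6 can also pick up the same group and migration from a declarative definition in hook_migrate_api(). The sketch below assumes the module's machine name is "example", which is not stated in this post. The titles are left as plain strings (a real module would normally wrap them in t()) so that the function can be read and run on its own. As far as I know, these definitions only take effect once static registration is triggered, for instance through the migrate-register Drush command.

```php
<?php

/**
 * Implements hook_migrate_api().
 *
 * Assumption: the module containing the handlers is called "example".
 */
function example_migrate_api() {
  return array(
    // Version of the Migrate API this module is written against.
    'api' => 2,
    // Groups, keyed by machine name.
    'groups' => array(
      'Example' => array(
        'title' => 'Very Important Migration Of Example Data',
      ),
    ),
    // Migrations, keyed by machine name. Keys other than class_name are
    // passed on as arguments, as with Migration::registerMigration().
    'migrations' => array(
      'ExampleData' => array(
        'class_name' => 'ExampleDataMigrateExampleMigration',
        'group_name' => 'Example',
        'description' => 'Migrate all example data.',
      ),
    ),
  );
}
```

The explicit calls shown earlier remain the better fit when migrations have to be registered at run time, for example from a form submit callback.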

Registration of migrations can also be used to your advantage when building a hierarchical import of unknown data. For example, let's assume that our example data values in fact define example sub-data. In this case, another migration destination and handler should be defined for the sub-data. But what if the import only makes sense after example data has been imported into the database? Migration handlers can be parameterized like anything else: any data passed in the arguments array of the register function will also be passed to the constructor of the migration handler. So, for each of the example data instances which were imported, an example sub-data migration can be defined, using the same handler with different parameters. An example code snippet on how to register such migrations follows. Note that the function in question is the complete method of our destination handler; as the complete method runs each time an item is imported, it is guaranteed to register the appropriate migration for every item.

    public function complete($imported, stdClass $row) {
      // Retrieve the value of the example data that was imported. $imported is
      // an object, so the value is read as a property.
      $value = $imported->example_value;

      // Define the machine name for the new migration.
      $machine_name = 'ExampleSubData' . $value;

      // Register the migration.
      $class = 'ExampleSubDataMigrationClassName';
      $args = array('value' => $value, 'group_name' => 'Example');
      Migration::registerMigration($class, $machine_name, $args);
    }
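
On the receiving side, the sub-data handler can read the 'value' argument in its constructor. The following is a minimal sketch of that idea, not a complete handler. The property $parentValue and the method getParentValue() are hypothetical, and the small Migration class at the top is only a stand-in so the sketch runs outside Drupal; inside a site, Migrate's own base class is used instead.

```php
<?php

// Stand-in for Migrate's Migration base class, defined only when the real
// one is not available (that is, outside a Drupal site).
if (!class_exists('Migration')) {
  abstract class Migration {
    protected $arguments = array();
    protected $description = '';

    public function __construct(array $arguments = array()) {
      $this->arguments = $arguments;
    }
  }
}

/**
 * Sketch of the sub-data handler; one migration is registered per value.
 */
class ExampleSubDataMigrationClassName extends Migration {

  // Hypothetical property holding the parent example value.
  protected $parentValue;

  public function __construct(array $arguments) {
    parent::__construct($arguments);

    // 'value' is the entry that complete() put into the arguments array.
    $this->parentValue = $arguments['value'];
    $this->description = 'Migrate sub-data of example value ' . $this->parentValue;

    // The map, source, destination and field mappings would be defined here,
    // with the source query restricted to rows belonging to the parent value.
  }

  // Hypothetical accessor, used here only to show the argument arrived.
  public function getParentValue() {
    return $this->parentValue;
  }

}

// For illustration only: inside Drupal, Migrate creates the instance itself
// from the registered class name and arguments.
$migration = new ExampleSubDataMigrationClassName(array('value' => 42, 'group_name' => 'Example'));
```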

That concludes this short tutorial on writing your own handlers for the Migrate API. It is an easy-to-use framework, suitable for many more purposes than it is given credit for. Hopefully, the upgrade path for Drupal 8 will use this module for a larger share of the data than the upgrade path for Drupal 7 did, as it would certainly make upgrade jobs cleaner and more streamlined.
