Updating nodes programmatically in Drupal 7

Revised 2015-05-29

See also Guide to programmatic node creation in Drupal 7.

Update (2015)

While the method described further down still works, a much better way of updating nodes is to use the entity metadata wrappers provided by the Entity API module (entity). The resulting code is cleaner and simpler. Example:

$node = node_load($nid);
$node_wrapper = entity_metadata_wrapper('node', $node);
$node_wrapper->field_foo_year->set(2015);
$node_wrapper->save();

Please use entity metadata wrappers and skip the rest of this article unless you have a compelling reason not to.


Just like creating nodes, updating/editing a node programmatically in Drupal 7 is very simple. See the node creation page for guidance on how to set/update various field types (text, date, taxonomy, etc.); I won’t duplicate all that here. Instead I’ll show you additional things like deleting field data, deleting attachments and managing revisions.

I highly recommend using print_r(node_load($nid)) (or, with drush, drush php-eval 'print_r(node_load(453));' with the actual node id in place of 453) to check nodes before and after modification; it’s a great way to learn about the data structures and see exactly what happens.

As always, don’t forget to make a backup of your database before you begin. If you have any questions, corrections or suggestions, please feel free to leave a comment.

Basic example

Put the code below in e.g. foo_update.php. Make sure it’s not accessible through the web (unless you don’t have shell access and must execute it through the browser and know what you are doing). In other words, put it outside your Drupal directory, and then just run php foo_update.php.

Also, just like when creating nodes, you have to make sure that the input is valid.

<?php
# Bootstrap start
define('DRUPAL_ROOT', '/path/to/your/drupal/root/directory');
$_SERVER['REMOTE_ADDR'] = "localhost"; // Necessary if running from command line
require_once DRUPAL_ROOT . '/includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
# Bootstrap end

$nid = 453; // The node to update

$node = node_load($nid); // ...where $nid is the node id

$node->title    = "Let's set a new title for this node";
$node->body[$node->language][0]['value']   = "And a new body text, too.";

node_save($node);
echo "Node with nid " . $node->nid . " updated!\n";
?>

If you use drush, you can skip the bootstrap part (everything between # Bootstrap start and # Bootstrap end) and, from your Drupal root directory, run drush scr ../foo_update.php (assuming foo_update.php is one directory above).
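
One thing the basic example above does not handle is a nid that doesn't exist: node_load() returns FALSE in that case, and the assignments that follow would then fail. Below is a minimal sketch of a guarded version, assuming Drupal is already bootstrapped (for example by drush scr). The function name foo_update_title_and_body() is hypothetical, and the body key follows the $node->language convention used throughout this article.

```php
<?php
/**
 * Hypothetical helper: update the title and body of a single node.
 *
 * Assumes Drupal 7 is already bootstrapped, so that node_load() and
 * node_save() are available.
 *
 * Returns TRUE if the node was saved, FALSE if it could not be loaded.
 */
function foo_update_title_and_body($nid, $title, $body) {
  $node = node_load($nid);
  if (!$node) {
    // node_load() returns FALSE when no node with this nid exists.
    return FALSE;
  }

  $node->title = $title;
  $node->body[$node->language][0]['value'] = $body;

  node_save($node);
  return TRUE;
}
```

A script run with drush scr could then call foo_update_title_and_body(453, 'New title', 'New body text') and print a message of its own when the return value is FALSE.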

Deleting field contents

To remove field data, I suggest using unset() like this:

// Delete the first value (index 0) of the field
unset($node->field_textfoo[$node->language][0]);
// Delete the third value (index 2) of a multi-value field
unset($node->field_textfoo[$node->language][2]);
// Delete the whole field (if the field is multi-value, all values are deleted)
unset($node->field_textfoo[$node->language]);

So, to empty a field, just use unset(). Often, however, you might want to do it conditionally. Say you have a term reference (taxonomy) field for tagging and want to remove a certain tag from a node. You can do it like this:

// Remove tag with taxonomy id 8
foreach ($node->field_tags[$node->language] as $index => $tag) {
    if ($tag['tid'] == 8) {
        unset($node->field_tags[$node->language][$index]);
    }
}
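
Note that unset() inside a loop leaves a gap in the array keys (for example 0 and 2 with no 1), and the loop itself will complain if the field is empty. Whether the gap matters depends on the code that reads the field afterwards. Here is a sketch of an alternative that filters and reindexes in one step. It is plain PHP with a stand-in object instead of a loaded node, and the helper name foo_remove_tid() is hypothetical.

```php
<?php
/**
 * Hypothetical helper: return $items without the entries whose 'tid'
 * equals $tid, reindexed from 0 so that the keys have no gaps.
 */
function foo_remove_tid(array $items, $tid) {
  $kept = array();
  foreach ($items as $item) {
    if ($item['tid'] != $tid) {
      $kept[] = $item;
    }
  }
  return $kept;
}

// Stand-in for a loaded node; a real one would come from node_load($nid).
$node = new stdClass();
$node->language = 'und';
$node->field_tags = array(
  'und' => array(
    array('tid' => 3),
    array('tid' => 8),
    array('tid' => 12),
  ),
);

// Remove the tag with taxonomy id 8, but only if the field has values.
$lang = $node->language;
if (!empty($node->field_tags[$lang])) {
  $node->field_tags[$lang] = foo_remove_tid($node->field_tags[$lang], 8);
}

print_r($node->field_tags[$lang]);
```

In a real update script the same call would sit between node_load() and node_save().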

Deleting attached files

To delete an attached file from disk and delete its database record, use file_delete(). Note that in Drupal 7, file_delete() takes a file object as its argument rather than a path; Drupal 6’s file_delete() is now file_unmanaged_delete().

// Load file object
$foofile = file_load($node->field_image[$node->language][0]['fid']);
// Delete file if it's not being used anywhere
file_delete($foofile);
// ...or force deletion even if the file is still in use
file_delete($foofile, TRUE);

To remove a file that’s being used in the node, I’d do something like this:

$filefoo = file_load($node->field_image[$node->language][0]['fid']);
unset($node->field_image[$node->language][0]);
file_delete($filefoo);

If this doesn’t seem to work, it’s most likely due to a problem with permissions. Check directory permissions and make sure the user running the script is able to remove uploaded files.

If you want to delete an attached file but keep the actual file on the server, you might think you could just unset the field:

unset($node->field_image[$node->language][0]);

And you can. But the above will still keep the file’s database record: it’s removed from file_usage but will still be in file_managed. There doesn’t seem to be any existing API call we can use straight away. Taking a hint from the file_delete() code, you can do this:

$filefoo = file_load($node->field_image[$node->language][0]['fid']);
unset($node->field_image[$node->language][0]);
module_invoke_all('file_delete', $filefoo);
module_invoke_all('entity_delete', $filefoo, 'file');
db_delete('file_managed')->condition('fid', $filefoo->fid)->execute();

…or simply hack core. Here is another way to do it.

If you want to remove a file without touching the database (perhaps because the file is not in the database to begin with), you can use file_unmanaged_delete():

file_unmanaged_delete("public://field/image/foo.jpg");

Revisions

I highly recommend using revisions when updating stuff programmatically. If something goes seriously wrong, it’s much easier to roll back to a previous revision than fiddling with database backups.

To make a new revision, all you need to do is set $node->revision to 1, and optionally add a log message. Here’s the basic example again, this time with revisions enabled:

<?php
define('DRUPAL_ROOT', getcwd()); // Assumes the script is run from the Drupal root directory
$_SERVER['REMOTE_ADDR'] = "localhost"; // Necessary if running from command line
require_once DRUPAL_ROOT . '/includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

$nid = 453;

$node = node_load($nid); // ...where $nid is the node id

$node->title    = "Let's set a new title for this node";
$node->body[$node->language][0]['value']   = "And a new body text, too.";

$node->revision = 1; // Create new revision
$node->log = "Updated programmatically"; // Log message

node_save($node);
echo "Node with nid " . $node->nid . " updated!\n";
?>

If you do print_r(node_load($nid)); you’ll notice that the id of the latest revision is stored in $node->vid. Node ids and revision ids are separate counters, so $node->nid and $node->vid only start out equal on a site where no extra revisions have ever been created. When you create a revision, $node->nid stays the same but $node->vid changes. Here’s how you load a specific revision of a node:

$node = node_load($nid, $vid);

You can get a list of a node’s revisions with node_revision_list($node). Note that this function takes a node object as its argument, so do something like this:

$node = node_load($nid);
$noderevisions = node_revision_list($node); // or node_revision_list(node_load($nid));
print_r($noderevisions);
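
To find the revision to roll back to, you need the vid that precedes the current one. The sketch below picks it out of a revision list. It assumes the list is an array of revision objects keyed by vid, which is how I understand the return value of node_revision_list(); the helper name foo_previous_vid() and the sample data are made up for illustration.

```php
<?php
/**
 * Hypothetical helper: given an array shaped like the return value of
 * node_revision_list() (assumed to be revision objects keyed by vid),
 * return the highest vid below $current_vid, or NULL if there is none.
 */
function foo_previous_vid(array $revisions, $current_vid) {
  $previous = NULL;
  foreach (array_keys($revisions) as $vid) {
    if ($vid < $current_vid && ($previous === NULL || $vid > $previous)) {
      $previous = $vid;
    }
  }
  return $previous;
}

// Stand-in data; in a Drupal script this would be node_revision_list($node).
$revisions = array(
  912 => (object) array('vid' => 912, 'log' => 'Updated programmatically'),
  640 => (object) array('vid' => 640, 'log' => ''),
  453 => (object) array('vid' => 453, 'log' => ''),
);

// In a Drupal script the second argument would be $node->vid.
$previous_vid = foo_previous_vid($revisions, 912);
echo "Previous revision: " . $previous_vid . "\n";
```

In a real script, node_load($nid, $previous_vid) would then load that older revision.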

Pathauto URL aliases

If you use Pathauto, it will overwrite custom aliases when the node is saved unless you explicitly disable it by setting the following before calling node_save():

$node->path['pathauto'] = FALSE;

See How does Pathauto determine if the ‘Automatic URL alias’ checkbox should be checked or not? for more information.