Opened 17 months ago
Last modified 11 months ago
#4268 new Defect
Updating metadata is a non thread-safe operation
| Reported by: | aszepeshazi | Owned by: | |
|---|---|---|---|
| Priority: | normal | Milestone: | Long Term Discussion |
| Component: | Core | Version: | 1.8 |
| Severity: | major | Keywords: | |
| Cc: | brett@… | Difficulty: | moderate |
Description (last modified by ewinslow)
When two or more parallel threads try to update the same metadata on the same object, the results are unpredictable.
To reproduce:
- Go to https://gist.github.com/1556812 , download the "parallel_test.php" script and drop it into your Elgg installation's root directory.
- Hit http://your-elgg-site/parallel_test.php in your browser a couple of times, and watch the php error log. A counter should increase one by one, showing how many times the object metadata was updated.
- To test the behavior of parallel threads, use apache benchmark. In a console, run "ab -n 100 -c 10 http://your-elgg-site/parallel_test.php" and watch the log. You'll see at some point the single integer type property turning into an array.
Cash's remarks:
"I believe this is a race condition - 2 PHP processes are creating the metadata at the same time. If we didn't support arrays, we could put a unique key on entity_guid and metadata name and let the database catch this. Because we do allow arrays, we try to catch this in PHP which creates the race condition."
Change History (8)
comment:1 follow-up: ↓ 2 Changed 15 months ago by beck24
comment:2 in reply to: ↑ 1 Changed 15 months ago by beck24
Replying to beck24:
Subscribing - I just found this happening on a live environment after debugging some strange behavior on a clients site. A piece of metadata for one user had transformed from a string to an array.
I should add that this site is running 1.7, but I don't think it's a surprise that this would affect both versions.
comment:3 Changed 12 months ago by ewinslow
- Description modified (diff)
- Difficulty set to moderate
- Milestone changed from Needs Review to Long Term Discussion
I'm not sure there's a good way to fix this. Perhaps if we could specify that certain metadata are always strings and others are always arrays? That wouldn't really get rid of the race condition, but it could gloss over it to the point where no one would know the difference.
comment:4 Changed 12 months ago by brettp
The real solution is DB transactions. I don't think we can reasonably predict which MD should be strings or arrays. A registration function for MD would solve that, but also makes it much more complicated.
comment:5 Changed 12 months ago by srokap
I'm currently experimenting with introducing transactions to elgg installation. Of course conversion to InnoDB is necessary and only 4 tables have fulltext indexes:
'groups_entity', 'objects_entity', 'sites_entity', 'users_entity'
As i've noticed so far, nothing terrible happens in this situation. There are orphaned entries in these tables after rollback.
My suggestion is to consider introducing transactions to Elgg in rather near future? My first experiments with rolling back extensive, recursive changes to entities metadata work without problems. Also basic tests with new entities work pretty well.
I see four paths to follow here:
- Live with 4 MyISAM tables and don't bother too much about small inconsistencies and consider some mechanism to fix orphaned data (rollback of deletion of entities seems tricky here)
- Wait for MySQL 5.6 and pretend that being thread-safe is not so important.
- Introduce transactions only for metadata related changes (I don't see any blocker here)
- Convert all MyISAM tables to InnoDB, drop current fulltext searches and think of separate table(s) used for fulltext searches. We could only add data and drop orphaned data in cron, it would reliable if we read it while joining with entities table. I don't see any other use tor fulltext search than.. searching entities, so such approach would do the trick.
Personally I'd love to see elgg following 4rd option, but I also think the 3rd option is the least we can do now.
What do you think?
comment:6 Changed 11 months ago by srokap
There's related discussion here: https://docs.google.com/document/d/1NrxIj4YOTjNbeXDGW3tpz2lNvaRwL2NDPBNd7TgRfFk/edit?disco=AAAAAEr6svk
comment:7 Changed 11 months ago by daniel
I've come across this issue while doing some stress testing. Updating the same metadata on a single user object with 100+ concurrency makes the metadata unpredictable. And when the concurrency increases (like 200+ and runs for 5 minutes) I can see tens of thousands record inserted in metadata table.
After that if you try to fetch that metadata (same name with around 10000+ rows of value), php will fall showing something like 'allocated memory * bytes exhausted', thus crush the whole site.
For now I'd say first we should avoid updating the same metadata concurrently if it's possible. I use this code below as a temporary solution, adding a where not exists clause to avoid multiple records.
/**
* Create a new metadata object, or update an existing one.
*
* Metadata can be an array by setting allow_multiple to TRUE, but it is an
* indexed array with no control over the indexing.
*
* @param int $entity_guid The entity to attach the metadata to
* @param string $name Name of the metadata
* @param string $value Value of the metadata
* @param string $value_type 'text', 'integer', or '' for automatic detection
* @param int $owner_guid GUID of entity that owns the metadata
* @param int $access_id Default is ACCESS_PRIVATE
*
* @return int/bool id of metadata or FALSE if failure
*/
function create_single_metadata($entity_guid, $name, $value, $value_type = '', $owner_guid = 0,
$access_id = ACCESS_PRIVATE) {
global $CONFIG;
$entity_guid = (int)$entity_guid;
// name and value are encoded in add_metastring()
//$name = sanitise_string(trim($name));
//$value = sanitise_string(trim($value));
$value_type = detect_extender_valuetype($value, sanitise_string(trim($value_type)));
$time = time();
$owner_guid = (int)$owner_guid;
if (!isset($value)) {
return FALSE;
}
if ($owner_guid == 0) {
$owner_guid = elgg_get_logged_in_user_guid();
}
$access_id = (int)$access_id;
$id = false;
$query = "SELECT * from {$CONFIG->dbprefix}metadata"
. " WHERE entity_guid = $entity_guid and name_id=" . add_metastring($name) . " limit 1";
$existing = get_data_row($query);
if ($existing) {
$id = (int)$existing->id;
$result = update_metadata($id, $name, $value, $value_type, $owner_guid, $access_id);
if (!$result) {
return false;
}
} else {
// Support boolean types
if (is_bool($value)) {
if ($value) {
$value = 1;
} else {
$value = 0;
}
}
// Add the metastrings
$value = add_metastring($value);
if (!$value) {
return false;
}
$name = add_metastring($name);
if (!$name) {
return false;
}
// If ok then add it
$query = "INSERT into {$CONFIG->dbprefix}metadata"
. " (entity_guid, name_id, value_id, value_type, owner_guid, time_created, access_id)"
. " SELECT $entity_guid, $name,$value,'$value_type', $owner_guid, $time, $access_id FROM dual"
. " WHERE NOT EXISTS (SELECT * from {$CONFIG->dbprefix}metadata"
. " WHERE entity_guid = $entity_guid and name_id=" . $name . " limit 1)";
$id = insert_data($query);
if ($id !== false) {
$obj = elgg_get_metadata_from_id($id);
if (elgg_trigger_event('create', 'metadata', $obj)) {
return $id;
} else {
elgg_delete_metadata_by_id($id);
}
}
}
return $id;
}
Another thing is that we also need to constrain the array size for a single metadata, from my experiment if the rows become unreasonably large, it will be dangerous to access it via magic methods like ->.
comment:8 Changed 11 months ago by daniel
After some more research I found it might not be the problem of create_metadata.
In ElggEntity.php we have
public function setMetaData(***
...
...
if (!$multiple) {
$options = array(
'guid' => $this->getGUID(),
'metadata_name' => $name,
'limit' => 0
);
elgg_delete_metadata($options);
}
...
...
create_metadata(***)
I guess the problem here is that when one thread trying to set a metadata named 'test', it will try to delete it first, then insert new record. Meanwhile another thread trying to set the same metadata 'test', it will find empty record due to the deletion, so it also inserts a new record - that's probably how the race condition comes in.
Actually in create_metadata it will try to update the existing record first, I think there is no need to delete the record in setMetadata. Removing the elgg_delete_metadata() code block in ElggEntity solves the problem in my experiment.
The test code I use is like this:
<?php //test.php require_once(dirname(__FILE__) . "/engine/start.php"); // try to get a valid user entity here $user=get_user(35); login($user); //echo create_metadata($user->guid,'testmeta','abcdabcdabcd','',$user->guid); $user->testmeta = 'abcdabcdabcd'; logout();
put this in elgg's root dir, and use ab to test:
ab -n 1000 -c 100 http://your-elgg-site/test.php
soon we can see multiple records filled up in metadata table.
Use create_metadata however doesn't reproduce the problem.
Hope this helps.

Subscribing - I just found this happening on a live environment after debugging some strange behavior on a clients site. A piece of metadata for one user had transformed from a string to an array.