We have moved to Github. Please open tickets there.

Opened 2 years ago

Closed 20 months ago

Last modified 19 months ago

#3396 closed Defect (fixed)

icon lookup fails to catch exceptions

Reported by: twall Owned by: cash
Priority: normal Milestone: Elgg 1.7.12
Component: Profile Version: 1.7
Severity: major Keywords:
Cc: brett@… Difficulty:

Description

Looking up an icon should not result in a fatal error.

Preferable behavior would be to use a default icon, or at worst, catch the exception somewhere in the call stack to prevent total meltdown.

The original cause of the filestore inconsistency had something to do with a failed user creation, but any filestore errors, regardless of cause, are going to trigger this type of failure.

This is particularly troublesome because user management is rendered non-functional with this error.

[25-Apr-2011 08:46:26] * FATAL EXCEPTION * : exception 'InvalidParameterException' with message 'File profile/3242small.png (file guid:0)
(owner guid:3242) is missing an owner!' in /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/filestore.php:251
Stack trace:
#0 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/filestore.php(262): ElggDiskFilestore->getFilenameOnFilestore(Object(ElggFile))
#1 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/filestore.php(628): ElggDiskFilestore->exists(Object(ElggFile))
#2 /home/bearp/svnroot/apps/elgg/trunk/html/mod/profile/start.php(289): ElggFile->exists()
#3 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/elgglib.php(877): profile_usericon_hook('entity:icon:url', 'user', false, Array)
#4 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/entities.php(2956): trigger_plugin_hook('entity:icon:url', 'user', Array, false)
#5 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/entities.php(730): get_entity_icon_url(Object(ElggUser), 'small')
#6 /home/bearp/svnroot/apps/elgg/trunk/html/mod/profile/views/default/profile/icon.php(76): ElggEntity->getIcon('small')
#7 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/views.php(245): include('/home/bearp/svn...')
#8 /home/bearp/svnroot/apps/elgg/trunk/html/mod/profile/views/default/profile/listing.php(16): elgg_view('profile/icon', Array)
#9 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/views.php(245): include('/home/bearp/svn...')
#10 /home/bearp/svnroot/apps/elgg/trunk/html/views/default/user/default.php(15): elgg_view('profile/listing', Array)
#11 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/views.php(245): include('/home/bearp/svn...')
#12 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/views.php(599): elgg_view('user/default', Array, true, false)
#13 /home/bearp/svnroot/apps/elgg/trunk/html/views/default/entities/entity_list.php(51): elgg_view_entity(Object(ElggUser), false)
#14 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/views.php(245): include('/home/bearp/svn...')
#15 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/views.php(693): elgg_view('entities/entity...', Array)
#16 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/entities.php(2333): elgg_view_entity_list(Array, 4, 0, 10, false, false, true)
#17 /home/bearp/svnroot/apps/elgg/trunk/html/admin/user.php(26): elgg_list_entities(Array)
#18 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/admin.php(109): include('/home/bearp/svn...')
#19 /home/bearp/svnroot/apps/elgg/trunk/html/engine/lib/pagehandler.php(33): admin_settings_page_handler(Array, 'admin')
#20 /home/bearp/svnroot/apps/elgg/trunk/html/engine/handlers/pagehandler.php(20): page_handler('admin', 'user/')
#21 {main}

Change History (29)

comment:1 Changed 2 years ago by twall

  • Component changed from Core to Profile

On closer inspection this looks like it should probably apply to mod/profile; it looks like profile_usericon_hook is what should be catching the exception.

comment:2 Changed 2 years ago by cash

  • Milestone changed from Needs Review to Elgg 1.7.9

comment:3 Changed 2 years ago by brettp

  • Resolution set to wontfix
  • Status changed from new to closed

If the icon is throwing that exception because it's missing an owner it means the DB is corrupt and user management wouldn't work for that user anyway. Not dealing well with incomplete tables is a larger problem with Elgg, and I don't think it can be solved here.

Cash wrote a plugin that checks your database for these issues. Might want to give that a try.

comment:4 Changed 2 years ago by twall

So the general elgg policy is to throw exceptions but not catch them?

Or just not catch them if you estimate that the only reason the exceptions would ever be thrown would be because of a fatal error?

Catching the exception doesn't mean you're out to fix the db error; it means you will have a functional interface instead of a blank screen or an exception message.

I would think that anytime you call code that throws exceptions you should at least prevent the exception from leaving you with a non-functional interface.

comment:5 Changed 2 years ago by brettp

I said this isn't something we can fix in a single spot because it's not a simple fix.

Right now the exception is telling you that your database is corrupt and you need to fix it. If we hide the exception in a few cases, it will appear to work on pages A and B but will fail on pages C - Z. Catching the exception doesn't provide a functional interface--the database is still corrupt and Elgg won't be able to handle partial records.

I'm not saying we don't plan to fix it, I'm saying we can't fix it in this single case for the 1.7.9 release because the problem is larger. The solution we have now (fix corrupted databases) is the safest.

comment:6 Changed 2 years ago by cash

A production system should be able to handle expected failures and degrade gracefully. It doesn't look good for Elgg if a single record becomes corrupted and an entire section of the site fails.

comment:7 Changed 2 years ago by brettp

I agree it needs to be fixed, but I think patching the icon to hide this exception for this one case will create more problems than it will solve because it will present functional-looking pages that are broken and other pages will continue to be broken. I don't think this is a 1.7.X fix.

comment:8 Changed 2 years ago by twall

You shouldn't hide the exception. In this case, you have a fairly obvious way of showing the presence of an error -- use a fallback icon that indicates the icon could not be loaded.

Looking at the code, there are a number of exceptions that could potentially be thrown for a number of different underlying reasons. Accessing an icon is usually not a critical task, so it seems reasonable to stop exception propagation at the point where you fail to get the icon, rather than letting it percolate all the way out.

IIRC, the object in question could never be displayed because of the broken icon (including management pages allowing easy deletion of the record). The single record defect pretty much made the whole site unusable.

comment:9 Changed 2 years ago by cash

  1. Should the capability to fix the problem be included in core?
  1. If so, we display the question mark icon with a tooltip that says your database is corrupted. Please run the database cleaner. Even if it is not, maybe display the ? icon with link to documentation.

Given how close we are to 1.7.9, I think we should reopen and look at fixing for 1.7.10.

comment:10 Changed 2 years ago by brettp

  • Milestone changed from Elgg 1.7.9 to Elgg 1.7.10
  • Resolution wontfix deleted
  • Status changed from closed to reopened

Reopening for 1.7.10 for further discussion. To answer the questions:

  1. I'm fine with including DB validation / repair in core.
  2. There are more places than icons that will need to catch the exception.

I'm uneasy at the idea of changing a "FIX YOUR DATABASE" fatal error to an "Eh, I'm sorta gonna work sometimes but you might want to deal with this eventually" tooltip. I think a real fix is deeper than that.

comment:11 follow-up: Changed 2 years ago by cash

Fine. We'll add a flaming icon gif that says your site is going down in flames if you don't fix your database :)

The issue is making the site unusable when it does not need to be.

comment:12 in reply to: ↑ 11 ; follow-up: Changed 2 years ago by brettp

Replying to cash:

The issue is making the site unusable when it does not need to be.

Yes, I understand this. My point is we're applying bandaids that makes the site usable on certain pages but not on others. I don't see this as a better solution than "Fix your database" right now.

comment:13 follow-up: Changed 2 years ago by twall

Well, it *is* better in that it allows you to display the page that gives you the option to "delete this object", which in this case *is* the "fix your database" solution.

Order of magnitude difference from manually inserting the code invoke delete via guid or browsing through phpmyadmin.

comment:14 in reply to: ↑ 12 Changed 2 years ago by cash

Replying to brettp:

Yes, I understand this. My point is we're applying bandaids that makes the site usable on certain pages but not on others. I don't see this as a better solution than "Fix your database" right now.

If this ever happened on our site, our site admins would completely panic. They are not developers and now they are a broken site with some cryptic message (that does not say fix your database. I'm not sure where you are getting the idea that it says that.). I really fail to see how this is a better solution than catching the exception and giving people a usable site on that page. We're obviously not going to solve the underlying issue in 1.7.x. Do we want to be taken seriously as production ready software? If so, we start applying band-aids where we can.

comment:15 in reply to: ↑ 13 Changed 2 years ago by brettp

Well, it *is* better in that it allows you to display the page that gives you the option to "delete this object", which in this case *is* the "fix your database" solution.

This is exactly my point, catching the exception DOES NOT allow you to delete the object because the object is corrupted and Elgg can't delete it.

They are not developers and now they are a broken site with some cryptic message (that does not say fix your database. I'm
not sure where you are getting the idea that it says that.).

Then we change the language string. If someone asked you what that cryptic message meant what would you tell them? That's where I'm getting it.

We're obviously not going to solve the underlying issue in 1.7.x.

This is another point I've been trying to make and the reason I closed the ticket to start with.

Do we want to be taken seriously as production ready software?

This is a pointless question to ask, and saying yes doesn't mean it's time to haphazardly apply bandaids instead of producing an actual fix. This is how many of the problems originated in Elgg's early development.

By now I don't care either way. It's not worth wasting time bickering over if someone wants to put try/catch throughout the plugins because there are more important things to address. This is already on 1.7.10's milestone so if anyone wants to address it, feel free.

comment:16 Changed 2 years ago by cash

Most sites do not have a developer onsite to troubleshoot the internal workings of Elgg. We should not require that. Right now with this issue we do. We present the cryptic error message and the solutions are to manually delete the entity in the database using phpmyadmin/mysql client or use a plugin that most admins do not know exists.

This happens every few weeks on the community site (#2707). I fix it through the API. What I do is go to someone's page, use Firebug to edit the GUID on the delete button, and then delete the corrupted user. Seems to work just fine.

I believe only a single try/catch is needed to handle this. I have only seen the error on pages that list users (the members list and the admin user administration page). The root cause is trying to display an icon for a corrupted user. The combination of a single try/catch with a delete button should be enough.

comment:17 Changed 2 years ago by brettp

This happens every few weeks on the community site (#2707). I fix it through the API. What I do is go to someone's page, use
Firebug to edit the GUID on the delete button, and then delete the corrupted user. Seems to work just fine.

It failed on my test because the user had ElggFile objects and it can't cascade through them. This goes back to the "sometimes it will work" point I made.

This discussion is going in circles. As I said, I agree there is a problem but don't think the best solution is patching the symptoms. If you want to add try/catch to the plugins, have at it.

comment:18 Changed 2 years ago by cash

  • Owner set to cash
  • Status changed from reopened to new

comment:19 Changed 2 years ago by cash

  • Milestone changed from Elgg 1.7.10 to Elgg 1.7.11

comment:20 Changed 22 months ago by brettp

  • Milestone changed from Elgg 1.7.11 to Elgg 1.7.12

Since Cash is MIA and I don't know what he had planned for this one I'm pushing it to 1.7.12.

comment:21 Changed 21 months ago by cash

I'll take a look at this ticket this week.

comment:22 Changed 20 months ago by cash

  • Resolution set to fixed
  • Status changed from new to closed

Fixes #3396 catching exception with a bad user when getting profile icon

Changeset: 020a85e85645322b09dc1179ac90f32b246a4eb0

comment:23 Changed 20 months ago by cash

I'm catching the exception on the profile icon request and using a ? icon. This lets the admin know something is wrong without taking the site down. I'm also logging the error using elgg_log() with a more descriptive error message.

comment:24 Changed 20 months ago by brettp

This looks like it's not an issue in 1.8's avatar handling because we don't use ElggFile so it's not been ported to master.

comment:25 Changed 20 months ago by cash

pages/avatar/view.php

comment:26 Changed 20 months ago by Brett Profitt

Refs #3396. Added try / catch for avatar icon.

Changeset: 34a7e57ef205362d6c093fa7234ca38977560bea

comment:27 Changed 19 months ago by cash

Refs #3396 fixed the integration of this 1.7 fix into 1.8 and also added better expires header

Changeset: 3ef53c9c97ec7f8e7022ca357a211531bde983c7

comment:28 Changed 19 months ago by Brett Profitt

Refs #3396. Added try / catch for avatar icon.

Changeset: 34a7e57ef205362d6c093fa7234ca38977560bea

comment:29 Changed 19 months ago by cash

Refs #3396 fixed the integration of this 1.7 fix into 1.8 and also added better expires header

Changeset: 3ef53c9c97ec7f8e7022ca357a211531bde983c7

Note: See TracTickets for help on using tickets.