[php-maint] Timezone querying

sean finney seanius at debian.org
Thu Mar 18 08:16:58 UTC 2010


hi all:

On Wed, Mar 17, 2010 at 04:17:13PM -0400, Filipus Klutiero wrote:
> at the end of February, PHP 5.3 came to me, with an expected number of issues. One issue that annoyed me was that a warning is now triggered if a time-related function is used without having set the timezone in 3 approved ways: the TZ environment variable, the date.timezone PHP directive or calling date_default_timezone_set(). Thankfully, Sean Finney made Debian's PHP quiet this warning - so PHP now relies on the system timezone (PHP was already able to use that as a last recourse, but considered it unreliable, for an obscure reason).

just for the record, i haven't changed any of the logic about what order
or behavior goes on underneath the hood, i've only shut up the warning :)

> I don't know for sure that baseline PHP is affected as badly, but it seems so. I seem to remember the issue is that each determination of the timezone causes a syscall - not sure again. It would be nice if somebody could check this...
> The application in question is called Twit, and I don't know what it is except it was written by Rasmus: http://www.ora600.be/news/liveblogging-confoo-not-just-php-performance-rasmus-lerdorf
> But many applications use time extensively, so I have no trouble believing this case is not an exception.

care to do a benchmark for us?  i'd also be interested to see how the
timezone-patched php stacks up against the vanilla php embedded tz.
the patched version may do more file i/o but since it's read only
and the same couple dozen files in /usr/share/zoneinfo every time,
the OS is probably doing us some favors.

i think it would be very unwise to try make "speedup fixes" to this stuff
without first understanding if/where the performance hit is occurring.
knuth's "premature optimization" quote comes to mind...

On Wed, Mar 17, 2010 at 10:00:59PM +0100, Ondřej Surý wrote:
> 1. Added overhead when timezone is not set -> I don't think we should
> really do anything in here. It's a simple matter of adding
> date_default_timezone_set("xxx/yyy") to your application. After adding
> this call PHP will use that value as first in guess_timezone().
> However we could put something about it to README.Debian.

What if we set the default ini configured timezone to "System/Localtime"?
Does that remove the guessing logic or just invoke the default guessing
pattern?

> 2. We call stat() everytime timelib_timezone_id_is_valid() is called.
> That is adds some slowdown, but I don't really see a way how to fix
> that unless you readahead all timezones, which will make PHP even more
> slower (mainly at startup, but that happens basically every time with
> non-threaded model). But I guess we could improve the performance
> little bit anyway - using snprintfs are somewhat suboptimal and some
> string related functions could be improved as well. I'll do that.

just fyi, after the last version of joe orton's tzdata patch, i tried making
various improvements (using mmapped regions instead of stdio file i/o, etc)
but even with really ridiculous test cases the difference in startup time
were negligable or non-existant.  so i think we should verify that there
is a runtime performance issue we can and want to improve first.

On Wed, Mar 17, 2010 at 10:50:21PM +0100, Philipp Kempgen wrote:
> In order to write portable code you need
> date_default_timezone_set( @date_default_timezone_get());
> anyway no matter whether Debian's PHP actually triggers a notice/
> warning or not.

sure, but i guess that's a bit out of scope apart from being an argument
for not spending a great deal of time worrying about the default tz :)

> Note that specifying a literal timezone in
> date_default_timezone_set('xxx/yyy') will never happen in a portable
> library.

but could definitely be found in an application.

> I don't know what Debian's PHP currently does but if it guesses
> the system's timezone on every invocation of date() (vs. once per
> interpreter invocation) and thus causes a slowdown without triggering
> a warning then that might even be worse than the strange default
> behavior from upstream.

maybe you should take a look at the (patched) source to see what it
does? :)
 
> Ugly alternative solution: How about a daily cron job that sets
> date.timezone (in /etc/php5/*/php.ini) to
> @date_default_timezone_get() resp. /etc/timezone ?
> (No, I'm not saying this is the way to go. Just an idea.)

that would be a Horrible-with-a-capital-H idea.  We're not supposed to
play with the local config files like that.  But maybe "System/Localtime"
as mentioned above might fix this?  I haven't tried using it myself.

On Wed, Mar 17, 2010 at 05:46:29PM -0600, Raphael Geissert wrote:
> > at the end of February, PHP 5.3 came to me, with an expected number of issues. One issue that annoyed me was that a warning is now triggered if a time-related function is used without having set the timezone in 3 approved ways: the TZ environment variable, the date.timezone PHP directive or calling date_default_timezone_set(). Thankfully, Sean Finney made Debian's PHP quiet this warning - so PHP now relies on the system timezone (PHP was already able to use that as a last recourse, but considered it unreliable, for an obscure reason).
> >
> 
> That actually needs to be fixed, because the results are actually
> incorrect (in spite of the tz foo patch):
> 
> $ cat /etc/timezone
> America/Mexico_City
> $ php -nr 'var_dump(date_default_timezone_get());'
> string(15) "America/Chicago"

out of curiosity, if you set it to use "System/Localtime" in php.ini,
does it take the correct one?

> What _is_ a problem is that before that's done, the
> /usr/share/zoneinfo directory (and contents) is stat()ed followed by
> lots of string comparisons which in total take some time. What does

i believe a call to localtime_r will result in the zone files being
read, even.  in #535770 i discussed with joe about the necessity of
this as well as some ideas for speeding it up, but when i looked into
doing so myself i couldn't produce a reasonable/notable speedup, at
least without needing to make some rather drastic changes.

> seem pretty inefficient is this:
>   0.000169 strcasecmp("utc", "CST")
>                 = 18 <0.000047>
<snip>
> <lots more skipped. Yes, comparing the same strings multiple times>

i think that's probably be find_zone_info, i.e. it's trying to find
the zone file with the matching name, while going through a hashtable
structure.  so the strings have the same value but they are likely not
the same string :)  that could probably be fixed with a secondary
hash table keyed on the tz name, i guess.

> Talking about performance (and efficiency), running a one-liner
> var_dump(''); with -n, shows this:
> $ grep -c ^free /tmp/trace
> 13511
> yep, 13k calls to free() in a very simple script.

all that stuff on the heap and still they still refuse to
move the core data structures off of the stack (remember
all those "Bogus" bugs about php, deep recursion, and stack
overflows?).  lolz.

On Thu, Mar 18, 2010 at 01:06:34AM +0100, Ondřej Surý wrote:
> > What _is_ a problem is that before that's done, the
> > /usr/share/zoneinfo directory (and contents) is stat()ed followed by
> > lots of string comparisons which in total take some time.
> 
> I guess it would be best to build something like a decision tree
> there, and cache it somewhere to /var/lib/php5/ and do a maintainance
> when tz database changes (or every hour or so).

one idea i mentioned in #535770 was to build a cached datastructure,
similar to the one used in the php-embedded tzdata, which we could
drop in /var/cache/php5/somewhere.  the datastructure could have the
combined zone data as well as the default tz pre-selected in an easy
to read (for the interpreter) format.

i don't think we'd need a cronjob for this, we could just put a dpkg
trigger on the tzdata files so that whenever they're updated the cache
file is rebuilt.

but at the risk of sounding like a broken record, before we put any
serious effort into this, shouldn't we actually verify that there's
a performance *problem*?

also, i would really want to sync this with joe orton @rh, as I'd rather
not have to maintain a "customized debian version of the customized
redhat tzdata patch".


woo... made it through the thread!

	sean

-- 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 190 bytes
Desc: Digital signature
URL: <http://lists.alioth.debian.org/pipermail/pkg-php-maint/attachments/20100318/0e9a21d7/attachment.pgp>


More information about the pkg-php-maint mailing list