Commit Graph

4813 Commits

Author SHA1 Message Date
Tim Abbott
10e7e15088 user_agent: Compile the regular expression.
We use this single regular expression for processing essentially every
request, so it's definitely worth hinting to Python that we're going
to do so by compiling it.  Saves about 40us per request.
2020-02-14 10:26:37 -08:00
Tim Abbott
800312c976 has_request_variables: Fix slow extraction of parameters.
A sloppy implementation of the main has_request_variables wrapper
function meant that it did two very inefficient things:

* To combine together the GET and POST parameters, it would make a
  copy of the request.GET QueryDict object, which combined with the
  fact that these objects are slow to access, consumed about 90us per
  argument.
* Doing this in a loop (one time per argument), rather than once,
  which resulted in us doing this 11 times for a `GET /events` query.

Fixing this to just make a dictionary and combine things with some
small loops saved about 1 millisecond from the total runtime of GET
/events (for comparison, the total actual work of that view function
is about 700ms).

We need to fix at least one test that used a bad mock HttpRequest
object that didn't have a .GET property.
2020-02-14 09:45:26 -08:00
Tim Abbott
4fbcbeeea7 settings: Disable django.request logging at WARNING log level.
The comment explains this issue, but effectively, the upgrade to
Django 2.x means that Django's built-in django.request logger was
writing to our errors logs WARNING-level data for every 404 and 400
error.  We don't consider user errors to be a problem worth
highlighting in that log file.
2020-02-13 23:50:53 -08:00
rht
41e3db81be dependencies: Upgrade to Django 2.2.10.
Django 2.2.x is the next LTS release after Django 1.11.x; I expect
we'll be on it for a while, as Django 3.x won't have an LTS release
series out for a while.

Because of upstream API changes in Django, this commit includes
several changes beyond requirements and:

* urls: django.urls.resolvers.RegexURLPattern has been replaced by
  django.urls.resolvers.URLPattern; affects OpenAPI code and related
  features which re-parse Django's internals.
  https://code.djangoproject.com/ticket/28593
* test_runner: Change number to suffix. Django changed the name in this
  ticket: https://code.djangoproject.com/ticket/28578
* Delete now-unnecessary SameSite cookie code (it's now the default).
* forms: urlsafe_base64_encode returns string in Django 2.2.
  https://docs.djangoproject.com/en/2.2/ref/utils/#django.utils.http.urlsafe_base64_encode
* upload: Django's File.size property replaces _get_size().
  https://docs.djangoproject.com/en/2.2/_modules/django/core/files/base/
* process_queue: Migrate to new autoreload API.
* test_messages: Add an extra query caused by .refresh_from_db() losing
  the .select_related() on the Realm object.
* session: Sync SessionHostDomainMiddleware with Django 2.2.

There's a lot more we can do to take advantage of the new release;
this is tracked in #11341.

Many changes by Tim Abbott, Umair Waheed, and Mateusz Mandera squashed
are squashed into this commit.

Fixes #10835.
2020-02-13 16:27:26 -08:00
Tim Abbott
1ea2f188ce tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.

The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).

As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
  to change a couple lines allowing us our Tornado code to not
  actually return the Django HttpResponse so we can long-poll.  The
  previous hack of returning None stopped being viable with the Django 2.2
  MiddlewareMixin.__call__ implementation.

But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:

* Replace RespondAsynchronously with a response.asynchronous attribute
  on the HttpResponse; this allows Django to run its normal plumbing
  happily in a way that should be stable over time, and then we
  proceed to discard the response inside the Tornado `get()` method to
  implement long-polling.  (Better yet might be raising an
  exception?).  This lets us eliminate maintaining a patched copy of
  _get_response.

* Removing the @asynchronous decorator, which didn't add anything now
  that we only have one API endpoint backend (with two frontend call
  points) that could call into this.  Combined with the last bullet,
  this lets us remove a significant hack from our
  never_cache_responses function.

* Calling the normal Django `get_response` method from zulip_finish
  after creating a duplicate request to process, rather than writing
  totally custom code to do that.  This lets us eliminate maintaining
  a patched copy of Django's load_middleware.

* Adding detailed comments explaining how this is supposed to work,
  what problems we encounter, and how we solve various problems, which
  is critical to being able to modify this code in the future.

A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.

There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard).  We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response.  Profiling will be important to understanding what's
worth doing here.
2020-02-13 16:13:11 -08:00
Mateusz Mandera
27b15a9722 install: Don't create internal realm in the installation process. 2020-02-12 12:00:10 -08:00
Mateusz Mandera
fe33966642 sessions: Implement the concept of expirable session variables.
This can be useful in the future for various things, and right now it'll
specifically be used in the signup mobile/desktop flows.
2020-02-12 11:09:55 -08:00
Hashir Sarwar
eb23c6fa6c test_fixtures: Clean up interface for template_database_status().
1) Created a new class `DatabaseType` and access its objects inside
`template_database_status()` instead of sending five arguments with
default values.

2) Made `check_files` and `setting_name` local variables instead of
function parameters since they had same value(None) for every call.

Fixes #13845.
2020-02-12 11:07:10 -08:00
Tim Abbott
96b0ec705d email_notifications: Fix missing translation tags on sender. 2020-02-12 10:54:34 -08:00
Anders Kaseorg
e257253e64 emoji_codes: Replace JS module with JSON module.
webpack optimizes JSON modules using JSON.parse("{…}"), which is
faster than the normal JavaScript parser.

Update the backend to use emoji_codes.json too instead of the three
separate JSON files.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-02-12 10:09:12 -08:00
Tim Abbott
cb2c96f736 test_templates: Remove shallow template rendering code.
This code was very useful when first implemented to help catch errors
where our backend templates didn't render, but has been superceded by
the success of our URL coverage testing (which ensures every URL
supported by Zulip's urls.py is accessed by our tests, with a few
exceptions) and other tests covering all of the emails Zulip sends.

It has a significant maintenance cost because it's a bit hacky and
involves generating fake context, so it makes sense to remove these.
Any future coverage issues with templates should be addressed with a
direct test that just accessing the relevant URL or sends the relevant
email.
2020-02-11 18:00:15 -08:00
Mateusz Mandera
2475adbf8a messages_for_topic: Use stream.recipient_id for more efficient query. 2020-02-11 17:39:43 -08:00
Steve Howell
900f98c0c5 presence: Use realm_id for UserPresence queries.
We now use realm_id for querying UserPresence
instead of building a big WHERE clause from the
list of user_ids.

This commit may be a bit hard to measure, since
we still get the list of user_ids for the PushToken
query in the same method.
2020-02-11 13:11:58 -08:00
Tim Abbott
fcac3a4342 recipients: Rename extract_recipients to extract_private_recipients.
Recent changes mean this function is now only used for private
messages.
2020-02-11 12:28:14 -08:00
Steve Howell
1b6578cafd messages: Fix bug with commas in stream names.
We now validate streams with a separate
function from PM recipients.

It's confusing enough all the ways you can
encode a stream or encode the PM recipients,
but trying to do it all in one function was
hard to reason about and led to at least one
bug.

In particular, there was a bug where streams
with commas in them would get split.  Now
we just don't ever split on commas inside
of `extract_stream_indicator`.

Fixes #13836
2020-02-11 12:20:54 -08:00
Steve Howell
96132fe0e9 extract_recipients: Enforce str as incoming type.
After removing internal_send_message() in a recent
commit, we now have only two callers for
extract_recipients, and they are both related
to our REQ mechanism that always passes strings
to converters.  (If there are default values,
REQ does not call the converters.)

We therefore make two changes:

    - use the more strict annotation of "str"
      for the `s` parameter

    - don't bother with the isinstance check
2020-02-11 12:20:54 -08:00
Steve Howell
8c3eaeb872 Remove obsolete internal_send_messages().
We have been phasing this out for a couple years,
and I fixed the last stragglers over the last
couple days.
2020-02-11 12:20:54 -08:00
Steve Howell
e37d660d19 error_notify: Use internal_send_stream_message(). 2020-02-11 12:20:53 -08:00
Steve Howell
c4e3cfebb0 presence: Add realm_id to UserPresence.
This index is intended to optimize the performance of the very
frequently run query of "what is the presence status of all users in a
realm?".

Main changes:
    - add realm_id to UserPresence
    - add index for realm_id
    - backfill realm_id for old rows
    - change all writes to UserPresence to include
      realm_id

The index is of this form:

    "zerver_userpresence_realm_id_5c4ef5a9" btree (realm_id)

We will create an index on (realm_id, timestamp) in a
future commit, but I think it's a bit faster if you do
the backfill before the index.

There's also a minor tweak to the populate_db script.
2020-02-10 17:21:45 -08:00
Steve Howell
28a8ffbc4c email_mirror: Use internal_send_stream_message().
This is just a refactoring to the more modern API
for sending internal messages.

To make this work we now plumb the email_gateway
flag through `internal_send_stream_message` instead
of `internal_send_message`.

We also change `send_zulip` to have its callers
pass in a full UserProfile object (which one of
them already had).
2020-02-10 15:45:13 -08:00
Steve Howell
6922eef380 signups: Use internal_send_stream_message().
We prefer this to internal_send_message().

We are trying to deprecate `internal_send_message`,
which has extra moving parts related to
`extract_recipients` and `Addressee.legacy_build`.

There are two chunks of code that I touch here
that look pretty similar, but I'm not quite
sure they're worth de-duplicating, since they
use different topics and different message
content.
2020-02-10 15:45:13 -08:00
Steve Howell
b33552997e cross realm bots: Simplify notify_new_user.
Instead of having `notify_new_user` delegate
all the heavy lifting to `send_signup_message`,
we just rename `send_signup_message` to be
`notify_new_user` and remove the one-line
wrapper.

We remove a lot of obsolete complexity:

    - `internal` was no longer ever set to True
      by real code, so we kill it off as well
      as well as killing off the internal_blurb code
      and the now-obsolete test

    - the `sender` parameter was actually an
      email, not a UserProfile, but I think
      that got past mypy due to the caller
      passing in something from settings.py

    - we were only passing in NOTIFICATION_BOT
      for the sender, so we just hard code
      that now

    - we eliminate the verbose
      `admin_realm_signup_notifications_stream`
      parameter and just hard code it to
      "signups"

    - we weren't using the optional realm
      parameter

There's also a long ugly comment in
`get_recipient_info` related to this code
that I amended for now.
We should try to take action in a subsequent
commit.
2020-02-10 15:45:13 -08:00
Hashir Sarwar
dcbd3e486f stream_subscription: Remove unused TypedDict SubInfo. 2020-02-10 14:04:22 -08:00
Steve Howell
2ff41bf9e5 /json/users: Use field.realm for realm lookup.
This avoids an unnecessary join to UserProfile.

To verify this, you can do `print(queries)` in the
`test_get_custom_profile_fields_from_api` test.  It's
kinda noisy, so I excerpted them below...

Before:

    SELECT ...
    FROM "zerver_customprofilefieldvalue"
    INNER JOIN "zerver_userprofile" ON ("zerver_customprofilefieldvalue"."user_profile_id" = "zerver_userprofile"."id")
    INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id")
    WHERE "zerver_userprofile"."realm_id" = 2

After:

    SELECT ...
    FROM "zerver_customprofilefieldvalue"
    INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id")
    WHERE "zerver_customprofilefield"."realm_id" = 2'

I don't have any way to measure the two queries with
realistic data, but I would assume the second
query is significantly faster on most of our instances,
since CustomProfileField should be tiny.
2020-02-09 22:04:02 -08:00
Steve Howell
01f180d042 minor: Remove unused line of code in get_raw_user_data().
The line removed here is a noop, as both sides of the
immediately following conditional reassign the
same variable.

This harmless cruft was the result of the recent commit
1ae5964ab8, which added
support for single-user GETs.
2020-02-09 22:04:02 -08:00
Vishnu KS
4572be8c27 api: Rename subject_links to topic_links.
Fixes #13588
2020-02-07 14:35:22 -08:00
Tim Abbott
84edb5c516 test_fixtures: Fix buggy reuse of status_dir between databases.
Apparently, the arguments passed to template_database_status were
incorrect for the manual testing development database, in that we
didn't pass a status_dir when calling into that code from provision.

The result was that provisioning before running `test-backend` would
ignore changes to the list of check_files (etc.) made after rebasing,
and vice versa.

The cleanest fix is to compute status_dir from other values passed in;
I'm also going to open a follow-up issue for creating a better overall
interface here.
2020-02-07 13:33:08 -08:00
akashaviator
1ae5964ab8 api: Add an api endpoint for GET /users/{id}
This adds a new API endpoint for querying basic data on a single other
user in the organization, reusing the existing infrastructure (and
view function!) for getting data on all users in an organization.

Fixes #12277.
2020-02-07 10:36:31 -08:00
Tim Abbott
e39840c705 users: Add read-only mode for access_user_by_id.
We've be using this in the upcoming GET /users/{id} method.
2020-02-07 10:36:31 -08:00
Tim Abbott
aa9286a1f9 users: Move query into caller of get_custom_profile_field_values.
This will be useful for supporting a smaller query for a single user.
2020-02-07 10:36:31 -08:00
Tim Abbott
79e5dd1374 users: Rename get_raw_user_data user parameter to acting_user.
This is for improved clarity as we extend this function to take
multiple user objects.
2020-02-07 10:36:31 -08:00
Steve Howell
7e99e7feb2 presence: Extract get_legacy_user_info.
This code is a bit flatter and just preps the data
for a single user.  There is never any interaction
between the data for user A and user B, so we can
mostly avoid complicated nested data structures
and do most of the data-crunching on a per-user basis.

We also do an explicit sort of the data before
running it through groupby.  The explicit sort
simplifies how we calculate `most_recent_info`
and also avoids needing to add `dt` to an intermediate
data structure.

Finally, when it comes to the individual client data,
the code has relied on the assumption that there is
only one row per client, which I believe to be true,
but now the code is more explicit about that.
2020-02-06 17:16:22 -08:00
Steve Howell
bf3baa14ac presence: Rename get_status_dict_by_user(). 2020-02-06 17:16:22 -08:00
Steve Howell
675f8514e8 presence: Rename get_status_dict().
We renamed this to get_presences_for_realm(),
and we have the caller pass in realm, not
user_profile.
2020-02-06 17:16:22 -08:00
Steve Howell
363e6bf239 presence: Move get_status_dicts_for_rows(). 2020-02-06 17:16:22 -08:00
Steve Howell
36fba1076f presence: Move get_status_dict_by_user. 2020-02-06 17:16:22 -08:00
Steve Howell
6f027d84a9 presence: Move get_status_dict_by_realm. 2020-02-06 17:16:22 -08:00
Steve Howell
703338dfa3 presence: Extract lib/presence.py.
This will make more sense when we pull some
code out of the model.
2020-02-06 17:16:22 -08:00
Tim Abbott
7c0a98754a home: Refactor logic for show_invites and show_add_streams. 2020-02-05 16:05:02 -08:00
Tim Abbott
7032f49f8e exceptions: Move default json_unauthorized string to response.py.
This small refactor should make it easier to reuse this exception for
other situations as well.
2020-02-05 15:40:10 -08:00
Anders Kaseorg
8e5a45267d test_classes: Use a valid (but reserved as fictional) phone number.
django-phonenumber-field 2.4.0 adds tighter phone number validation
that rejects +12223334444 for having an invalid area code.  This was
reverted in 4.0.0, but django-two-factor-auth still requires <3.99.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-02-05 12:38:10 -08:00
Ryan Rehman
174b2abcfd settings: Migrate to stream_post_policy structure.
This commit includes a new `stream_post_policy` setting,
by replacing the `is_announcement_only` field from the Stream model,
which is done by mirroring the structure of the existing
`create_stream_policy`.

It includes the necessary schema and database migrations to migrate
the is_announcement_only boolean field to stream_post_policy,
a smallPositiveInteger field similar to many other settings.

This change is done to allow organization administrators to restrict
new members from creating and posting to a stream. However, this does
not affect admins who are new members.

With many tweaks by tabbott to documentation under /help, etc.

Fixes #13616.
2020-02-04 17:08:08 -08:00
Mateusz Mandera
30d02c2e2c test_fixtures: app_label should be a positional arg in call_command.
We were incorrectly passing it as a kwarg, which would cause an
exception on Django 2.
2020-02-04 12:46:53 -08:00
Mateusz Mandera
0e7c97378e is_safe_url: Use allowed_hosts instead of depreciated host argument.
Judging by comparing django 1.11 with django 2.2 code of this function,
this shouldn't change any behavior.
2020-02-04 12:46:53 -08:00
Steve Howell
e3ad9baf1d presence: Add process_presence_event.
This lets us conditionally remove the email
field from a presence event if the client
has registered with the slim_presence flag.
2020-02-04 12:30:36 -08:00
Steve Howell
9847d4d9a3 refactor: Use user_id in get_status_dict_by_user.
This avoids a needless user lookup in apply_event.
2020-02-04 12:30:36 -08:00
Steve Howell
a672a00677 presence: Add user_id to presence event.
In a later commit, we will eliminate email for
clients who have set slim_presence as their
preference.
2020-02-04 12:30:36 -08:00
Steve Howell
bf9144ff69 presence: Add slim_presence flag.
This flag affects page_params and the
payload you get back from POSTs to this
url:

    users/me/presence

The flag does not yet affect the
presence events that get sent to a
client.
2020-02-04 12:30:34 -08:00
Vishnu Ks
5dfd4ea38d export: Remove unused parameter from _get_exported_s3_record. 2020-02-03 14:09:05 -08:00
Vishnu Ks
5a59bf329e import: Skip setting user_profile_id metadata only if unavailable. 2020-02-03 14:09:05 -08:00